[jira] [Assigned] (SOLR-12546) CSVResponseWriter doesn't return non-stored field even when docValues is enabled, when no explicit fl specified

2018-11-14 Thread Ishan Chattopadhyaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-12546:
---

Assignee: (was: Ishan Chattopadhyaya)

> CSVResponseWriter doesn't return non-stored field even when docValues is 
> enabled, when no explicit fl specified
> --
>
> Key: SOLR-12546
> URL: https://issues.apache.org/jira/browse/SOLR-12546
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Response Writers
>Affects Versions: 7.2.1
>Reporter: Karthik S
>Priority: Major
> Fix For: 7.2.2
>
> Attachments: SOLR-12546.patch, SOLR-12546.patch
>
>
> As part of SOLR-2970, CSVResponseWriter was changed not to return fields 
> whose stored attribute is set to false, but it does not take docValues 
> into account.
>  
> As a result, fields with stored=false and docValues=true are not returned when 
> no explicit fl is specified. The behavior should be the same as that of the 
> JSON/XML response writers.
>  
> Eg:
> -  Created collection with below fields
>  type="string"/> 
>  type="int" stored="false"/>
>  type="plong" stored="false"/>
> 
>  precisionStep="0"/>
> 
>  
>  
> -  Added few documents
> contentid,testint,testlong
> id,1,56
> id2,2,66
>  
> -  http://machine:port/solr/testdocvalue/select?q=*:*&wt=json
> [{"contentid":"id","_version_":1605281886069850112,
> "timestamp":"2018-07-06T22:28:25.335Z","testint":1,
> "testlong":56},
>   {
> "contentid":"id2","_version_":1605281886075092992,
> "timestamp":"2018-07-06T22:28:25.335Z","testint":2,
> "testlong":66}]
>  
> -  http://machine:port/solr/testdocvalue/select?q=*:*&wt=csv
> "_version_",contentid,timestamp1605281886069850112,id,2018-07-06T22:28:25.335Z1605281886075092992,id2,2018-07-06T22:28:25.335Z
>  
>  
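The CSV output above is missing the testint and testlong columns, while the JSON output includes them. The fix presumably needs the CSV writer's default field selection to treat docValues-backed fields the way the JSON/XML writers do. Below is a self-contained sketch of that selection rule; the real code would consult Solr's SchemaField (which exposes methods such as stored() and useDocValuesAsStored()), so this stand-in class and its names only mirror the shape of the check, they are not the actual Solr API:

```java
// Hypothetical stand-in for the field-selection rule: with no explicit fl,
// a field should be returnable if it is stored, OR if it has docValues
// that may be used as stored (mirroring the JSON/XML writers' behavior).
class FieldSpec {
    final boolean stored;
    final boolean docValues;
    final boolean useDocValuesAsStored;

    FieldSpec(boolean stored, boolean docValues, boolean useDocValuesAsStored) {
        this.stored = stored;
        this.docValues = docValues;
        this.useDocValuesAsStored = useDocValuesAsStored;
    }

    // The pre-fix CSV writer effectively checks only `stored`,
    // dropping fields like testint/testlong above.
    boolean returnableByDefault() {
        return stored || (docValues && useDocValuesAsStored);
    }
}
```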



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12791) Add Metrics reporting for AuthenticationPlugin

2018-10-03 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637205#comment-16637205
 ] 

Ishan Chattopadhyaya commented on SOLR-12791:
-

Hi [~janhoy], I'll be able to get to this only by the coming Monday, 8th 
October. I'll keep it on the top of my list of things to look at. Sorry for the 
delay; feel free to proceed if Andrzej gets to it before me. Thanks for your 
work on this.

> Add Metrics reporting for AuthenticationPlugin
> --
>
> Key: SOLR-12791
> URL: https://issues.apache.org/jira/browse/SOLR-12791
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, metrics
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Propose to add Metrics support for all Auth plugins. Will let abstract 
> {{AuthenticationPlugin}} base class implement {{SolrMetricProducer}} and keep 
> the counters, such as:
>  * requests
>  * req authenticated
>  * req pass-through (no credentials and blockUnknown false)
>  * req with auth failures due to wrong or malformed credentials
>  * req auth failures due to missing credentials
>  * errors
>  * timeouts
>  * timing stats
> Each implementation still needs to increment the counters etc.
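The proposed counter set could be sketched with a self-contained stand-in like the following. Plain JDK LongAdder is used here for illustration only; the actual patch would register these through Solr's metrics API via SolrMetricProducer, so the class and method names below are assumptions, not the committed design:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the counters the proposal lists; a real
// implementation would register equivalents via SolrMetricProducer.
class AuthMetrics {
    final LongAdder requests = new LongAdder();               // all requests seen
    final LongAdder authenticated = new LongAdder();          // successfully authenticated
    final LongAdder passThrough = new LongAdder();            // no credentials, blockUnknown=false
    final LongAdder failWrongCredentials = new LongAdder();   // wrong or malformed credentials
    final LongAdder failMissingCredentials = new LongAdder(); // credentials absent but required
    final LongAdder errors = new LongAdder();
    final LongAdder timeouts = new LongAdder();

    // Each concrete plugin would call one of these per request, as the
    // issue notes ("Each implementation still needs to increment the counters").
    void onAuthenticated()      { requests.increment(); authenticated.increment(); }
    void onPassThrough()        { requests.increment(); passThrough.increment(); }
    void onWrongCredentials()   { requests.increment(); failWrongCredentials.increment(); }
    void onMissingCredentials() { requests.increment(); failMissingCredentials.increment(); }
}
```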






[jira] [Updated] (SOLR-12554) Expose IndexWriterConfig's RAMPerThreadHardLimitMB as SolrConfig.xml param

2018-07-17 Thread Ishan Chattopadhyaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-12554:

Attachment: SOLR-12554.patch

> Expose IndexWriterConfig's RAMPerThreadHardLimitMB as SolrConfig.xml param
> --
>
> Key: SOLR-12554
> URL: https://issues.apache.org/jira/browse/SOLR-12554
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-12554.patch
>
>
> Currently, the RAMPerThreadHardLimitMB parameter of IWC is not exposed. This 
> is useful to control flush policies.
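If exposed, the knob would presumably appear under indexConfig in solrconfig.xml, along the lines below. The element name and placement are assumptions pending the committed patch; for context, Lucene's IndexWriterConfig defaults this limit to 1945 MB:

```xml
<indexConfig>
  <!-- Hypothetical: hard cap on the RAM a single indexing thread (DWPT)
       may accumulate before its in-memory segment is forcibly flushed -->
  <ramPerThreadHardLimitMB>256</ramPerThreadHardLimitMB>
</indexConfig>
```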






[jira] [Created] (SOLR-12554) Expose IndexWriterConfig's RAMPerThreadHardLimitMB as SolrConfig.xml param

2018-07-16 Thread Ishan Chattopadhyaya (JIRA)
Ishan Chattopadhyaya created SOLR-12554:
---

 Summary: Expose IndexWriterConfig's RAMPerThreadHardLimitMB as 
SolrConfig.xml param
 Key: SOLR-12554
 URL: https://issues.apache.org/jira/browse/SOLR-12554
 Project: Solr
  Issue Type: New Feature
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Ishan Chattopadhyaya
Assignee: Ishan Chattopadhyaya


Currently, the RAMPerThreadHardLimitMB parameter of IWC is not exposed. This is 
useful to control flush policies.






Re: Hello, Need help regarding the learning to rank

2018-07-12 Thread Ishan Chattopadhyaya
Please ask on the solr-user mailing list.

On Tue, Jul 10, 2018 at 7:35 PM, Akshay Patil  wrote:

> Hi,
>
> I am a student; for my master's thesis I am working on Learning To Rank.
> As I researched it, I found the solution provided by Bloomberg. But I
> would like to ask: with the example that you have provided, it always shows
> a Bad Request error.
>
> Do you have a running example of it, so that I can adapt it to my application?
>
> I am trying to use the example that you have provided on GitHub.
>
> core :- techproducts
> traning_and_uploading_demo.py
>
> It generates the training data, but I am getting a problem when uploading
> the model: it shows a Bad Request error (empty request body). Please help
> me out with this problem, so I will be able to adapt it to my application.
>
> Best Regards !
>
> Any help would be appreciated 
>
>
> Akshay
>


Re: [ANNOUNCE] Apache Lucene 6.6.5 released

2018-07-04 Thread Ishan Chattopadhyaya
Apologies for the confusion. Sloppy on my part. This release of Lucene
didn't contain any changes.

On Wed 4 Jul, 2018, 14:39 Rob Audenaerde,  wrote:

> "This release contains one bug fix."
>
> http://lucene.apache.org/core/6_6_5/changes/Changes.html
> No changes.
>
> Something does not add up?
>
> On Tue, Jul 3, 2018 at 11:29 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> 3 July 2018, Apache Lucene™ 6.6.5 available
>>
>> The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.5.
>>
>> Apache Lucene is a high-performance, full-featured text search engine
>> library written entirely in Java. It is a technology suitable for nearly
>> any application that requires full-text search, especially cross-platform.
>>
>> This release contains one bug fix. The release is available for immediate
>> download at:
>> http://lucene.apache.org/core/mirrors-core-latest-redir.html
>>
>> Further details of changes are available in the change log available at:
>> http://lucene.apache.org/core/6_6_5/changes/Changes.html
>>
>> Please report any feedback to the mailing lists (
>> http://lucene.apache.org/core/discussion.html)
>>
>> Note: The Apache Software Foundation uses an extensive mirroring network
>> for distributing releases. It is possible that the mirror you are using
>> may
>> not have replicated the release yet. If that is the case, please try
>> another mirror. This also applies to Maven access.
>>
>>


[ANNOUNCE] Apache Solr 6.6.5 released

2018-07-03 Thread Ishan Chattopadhyaya
03 July 2018, Apache Solr™ 6.6.5 available

The Lucene PMC is pleased to announce the release of Apache Solr 6.6.5

Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, faceted search and analytics, rich document
parsing, geospatial search, extensive REST APIs as well as parallel SQL.
Solr is enterprise grade, secure and highly scalable, providing fault
tolerant distributed search and indexing, and powers the search and
navigation features of many of the world's largest internet sites.

This release includes the following changes:

* Ability to disable configset upload via -Dconfigset.upload.enabled=false
 startup parameter
* Referral to external resources in various config files is now disallowed

The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/solr/6.6.5

Please read CHANGES.txt for a detailed list of changes:

https://lucene.apache.org/solr/6_6_5/changes/Changes.html

Please report any feedback to the mailing lists (
http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also goes for Maven access.


[ANNOUNCE] Apache Lucene 6.6.5 released

2018-07-03 Thread Ishan Chattopadhyaya
03 July 2018, Apache Lucene™ 6.6.5 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.5.

Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-text search, especially cross-platform.

This release contains no changes over 6.6.4. The release is available for
immediate download at:
http://lucene.apache.org/core/mirrors-core-latest-redir.html

Further details of changes are available in the change log available at:
http://lucene.apache.org/core/6_6_5/changes/Changes.html

Please report any feedback to the mailing lists (
http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also applies to Maven access.


[ANNOUNCE] Apache Lucene 6.6.5 released

2018-07-03 Thread Ishan Chattopadhyaya
3 July 2018, Apache Lucene™ 6.6.5 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.5.

Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-text search, especially cross-platform.

This release contains one bug fix. The release is available for immediate
download at:
http://lucene.apache.org/core/mirrors-core-latest-redir.html

Further details of changes are available in the change log available at:
http://lucene.apache.org/core/6_6_5/changes/Changes.html

Please report any feedback to the mailing lists (
http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also applies to Maven access.


[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search

2018-06-29 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527334#comment-16527334
 ] 

Ishan Chattopadhyaya commented on LUCENE-7745:
--

Ah, I think I wasn't clear on my intentions behind those numbers.
 
bq. if it brings any performance - I doubt that, because the call overhead 
between Java and CUDA is way too high - in contrast to Postgres where all in 
plain C/C++
I wanted to start with those experiments just to prove to myself that there are 
no significant overheads or bottlenecks (as we've feared in the past) and that 
there can be clear benefits to be realized.

I wanted to try bulk scoring, and chose the distance calculation and sorting as 
an example because (1) it leverages two fields, (2) it was fairly isolated & 
easy to try out.

In practical use cases of spatial search, the spatial filtering doesn't require 
score calculation & sorting on the entire dataset (just those documents that 
are in the vicinity of the user point, filtered down by the geohash or BKD tree 
node); so in some sense I was trying out an absolute worst case of Lucene 
spatial search. 

Now that I'm convinced that this overall approach works and overheads are low, 
I can move on to looking at Lucene internals, maybe starting with scoring 
in general (BooleanScorer, for example). Other parts of Lucene/Solr that might 
see benefit could be streaming expressions (since they seem computation heavy), 
LTR re-ranking, etc.


Actually incorporating all these benefits into Lucene would require 
considerable effort, and we can open subsequent JIRAs once we've had a chance 
to explore them separately. Till then, I'm inclined to keep this issue as a 
kitchen sink for all-things-GPU, if that makes sense?

> Explore GPU acceleration for spatial search
> ---
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/spatial-extras
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be speeded up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this this summer.






[jira] [Comment Edited] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524890#comment-16524890
 ] 

Ishan Chattopadhyaya edited comment on LUCENE-7745 at 6/27/18 11:07 AM:


Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores 
(which may leverage one or more indexed fields).
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising. The GPU is able to score 11M documents in ~50ms! Here, 
blue is GPU and red is CPU (Lucene). 

!gpu-benchmarks.png|width=450!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java


was (Author: ichattopadhyaya):
Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising. The GPU is able to score 11M documents in ~50ms!. Here, 
blue is GPU and red is CPU (Lucene). 

!gpu-benchmarks.png|width=450!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java
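The benchmarked kernel described in points 1-2 amounts to: compute a per-document distance to the query point, then return docids sorted by that distance. A minimal CPU-side sketch follows, using plain Euclidean distance in place of the geo distance actually benchmarked (class and method names are illustrative, not from the linked repository):

```java
import java.util.Comparator;
import java.util.stream.IntStream;

// CPU baseline sketch of the benchmarked kernel: score every document by
// its distance to a query point, then sort docids ascending by distance.
// A GPU version would do the distance map and the sort in bulk (e.g. via
// Thrust) and copy scores plus sorted docids back to the host.
class DistanceSort {
    static int[] sortByDistance(double[] xs, double[] ys, double qx, double qy) {
        final double[] dist = new double[xs.length];
        for (int i = 0; i < xs.length; i++) {
            double dx = xs[i] - qx, dy = ys[i] - qy;
            dist[i] = Math.sqrt(dx * dx + dy * dy);
        }
        // Argsort: docids ordered by their computed distance.
        return IntStream.range(0, xs.length)
                .boxed()
                .sorted(Comparator.comparingDouble(i -> dist[i]))
                .mapToInt(Integer::intValue)
                .toArray();
    }
}
```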

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be speeded up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this this summer.






[jira] [Assigned] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned LUCENE-7745:


Assignee: Ishan Chattopadhyaya

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be speeded up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this this summer.






[jira] [Comment Edited] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524890#comment-16524890
 ] 

Ishan Chattopadhyaya edited comment on LUCENE-7745 at 6/27/18 10:51 AM:


Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising. The GPU is able to score 11M documents in ~50ms! Here, 
blue is GPU and red is CPU (Lucene). 

!gpu-benchmarks.png|width=450!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java


was (Author: ichattopadhyaya):
Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising. The GPU is able to score 11M documents in ~50ms!. Here, 
blue is GPU and red is CPU (Lucene). 

!gpu-benchmarks.png|width=800!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be speeded up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this this summer.






[jira] [Comment Edited] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524890#comment-16524890
 ] 

Ishan Chattopadhyaya edited comment on LUCENE-7745 at 6/27/18 10:51 AM:


Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising. The GPU is able to score 11M documents in ~50ms! Here, 
blue is GPU and red is CPU (Lucene). 

!gpu-benchmarks.png|width=800!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java


was (Author: ichattopadhyaya):
Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising. The GPU is able to score 11M documents in ~50ms!. Here, 
blue is GPU and red is CPU (Lucene). 

!gpu-benchmarks.png!width=800!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be speeded up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this this summer.






[jira] [Comment Edited] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524890#comment-16524890
 ] 

Ishan Chattopadhyaya edited comment on LUCENE-7745 at 6/27/18 10:50 AM:


Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising. The GPU is able to score 11M documents in ~50ms! Here, 
blue is GPU and red is CPU (Lucene). 

!gpu-benchmarks.png!width=800!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java


was (Author: ichattopadhyaya):
Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising. The GPU is able to score 11M documents in ~50ms!. Here, 
blue is GPU and red is CPU (Lucene). 

!Screenshot from 2018-06-27 15-33-37.png!width=800!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be sped up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this, this summer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated LUCENE-7745:
-
Attachment: gpu-benchmarks.png

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be sped up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this, this summer.






[jira] [Updated] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated LUCENE-7745:
-
Attachment: (was: Screenshot from 2018-06-27 15-33-37.png)

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be sped up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this, this summer.






[jira] [Comment Edited] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524890#comment-16524890
 ] 

Ishan Chattopadhyaya edited comment on LUCENE-7745 at 6/27/18 10:40 AM:


Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising: the GPU is able to score 11M documents in ~50 ms! Here, 
blue is GPU and red is CPU (Lucene). 

!Screenshot from 2018-06-27 15-33-37.png|width=800!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java


was (Author: ichattopadhyaya):
Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising: the GPU is able to score 11M documents in ~50 ms! Here, 
blue is GPU and red is CPU (Lucene). 

!Screenshot from 2018-06-27 15-33-37.png!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: Screenshot from 2018-06-27 15-33-37.png
>
>
> There are parts of Lucene that can potentially be sped up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this, this summer.






[jira] [Updated] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated LUCENE-7745:
-
Attachment: Screenshot from 2018-06-27 15-33-37.png

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: Screenshot from 2018-06-27 15-33-37.png
>
>
> There are parts of Lucene that can potentially be sped up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this, this summer.






[jira] [Commented] (LUCENE-7745) Explore GPU acceleration

2018-06-27 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524890#comment-16524890
 ] 

Ishan Chattopadhyaya commented on LUCENE-7745:
--

Here [0] are some very initial experiments that I ran, along with Kishore 
Angani, a colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores.
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance from a given query point. Return the sorted list of documents 
based on these distances.
3. GPU implementation based on Thrust library (C++ based Apache 2.0 licensed 
library), called from JNI wrapper. Timings include copying data (scores and 
sorted docids) back from GPU to host system and access from Java (via 
DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.

Results seem promising: the GPU is able to score 11M documents in ~50 ms! Here, 
blue is GPU and red is CPU (Lucene). 

!Screenshot from 2018-06-27 15-33-37.png!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - 
https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java
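To make the "generic problem" concrete, here is a hypothetical CPU-side Java sketch of step 2 (the class, method, and data below are made up for illustration and are not the benchmark code from [0]; a GPU port would run the same two phases as a Thrust transform followed by a sort_by_key over (distance, docid) pairs):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

public class DistanceSort {
    // CPU reference for the "specific problem": score each doc by its squared
    // distance from the query point, then return docids sorted nearest-first.
    static int[] sortByDistance(int[] docids, double[] x, double[] y,
                                double qx, double qy) {
        // Phase 1: compute a score per document (squared distance is
        // order-preserving, so the square root can be skipped).
        double[] d2 = new double[docids.length];
        for (int i = 0; i < docids.length; i++) {
            double dx = x[i] - qx, dy = y[i] - qy;
            d2[i] = dx * dx + dy * dy;
        }
        // Phase 2: sort docids by their scores.
        Integer[] order = IntStream.range(0, docids.length)
                                   .boxed().toArray(Integer[]::new);
        Arrays.sort(order, Comparator.comparingDouble(i -> d2[i]));
        int[] sorted = new int[docids.length];
        for (int i = 0; i < sorted.length; i++) sorted[i] = docids[order[i]];
        return sorted;
    }
}
```

Both phases are embarrassingly parallel over documents, which is why the copy-out cost (scores plus sorted docids over JNI/DirectByteBuffer) is the main overhead worth timing.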

> Explore GPU acceleration
> 
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
>  Issue Type: Improvement
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
>
> There are parts of Lucene that can potentially be sped up if computations 
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as 
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to 
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known 
> to be a good candidate for GPU based speedup (esp. when complex polygons are 
> involved). In the past, Mike McCandless has mentioned that "both initial 
> indexing and merging are CPU/IO intensive, but they are very amenable to 
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on this, this summer.






Re: [VOTE] Release Lucene/Solr 7.4.0 RC1

2018-06-20 Thread Ishan Chattopadhyaya
Given this is just a WARN, I don't think this is a release blocker. But it
would be nice to fix for a better user experience.
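(For anyone hitting the warning, the fix on the configset side is a one-line pin; an illustrative solrconfig.xml fragment — the surrounding elements depend on the configset:

{code}
<config>
  <!-- Pin to the release the index was built with, so a later Lucene
       upgrade cannot silently change analysis/scoring behavior. -->
  <luceneMatchVersion>7.4.0</luceneMatchVersion>
</config>
{code}
)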

On Wed, Jun 20, 2018 at 12:35 PM, Adrien Grand  wrote:

> Can someone comment on whether this is worth respinning?
>
>
> Le mar. 19 juin 2018 à 16:57, Varun Thacker  a écrit :
>
>> I was testing the RC manually , and then when I went to create the
>> ".system" collection I noticed this in the logs:
>>
>> WARN  - 2018-06-19 14:53:00.400; [c:.system s:shard1 r:core_node2
>> x:.system_shard1_replica_n1] org.apache.solr.core.Config; You should not
>> use LATEST as luceneMatchVersion property: if you use this setting, and
>> then Solr upgrades to a newer release of Lucene, sizable changes may
>> happen. If precise back compatibility is important then you should instead
>> explicitly specify an actual Lucene version.
>>
>>
>> Anyone more familiar with the ".system"  collection and which config set
>> it picks? The default configset has luceneMatchVersion = 7.4.0. And how 
>> important is this issue?
>>
>> On Tue, Jun 19, 2018 at 1:07 PM, Alan Woodward 
>> wrote:
>>
>>> +1
>>>
>>> SUCCESS! [1:57:22.136590]
>>>
>>>
>>> On 18 Jun 2018, at 21:27, Adrien Grand  wrote:
>>>
>>> Please vote for release candidate 1 for Lucene/Solr 7.4.0
>>>
>>> The artifacts can be downloaded from:
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.4.0-RC1-
>>> rev9060ac689c270b02143f375de0348b7f626adebc
>>>
>>> You can run the smoke tester directly with this command:
>>>
>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.4.0-RC1-
>>> rev9060ac689c270b02143f375de0348b7f626adebc
>>>
>>>
>>> 
>>> Here’s my +1
>>> SUCCESS! [0:48:15.228535]
>>>
>>>
>>>
>>


Re: Welcome Nhat Nguyen as Lucene/Solr committer

2018-06-18 Thread Ishan Chattopadhyaya
Welcome Nhat. Excited to have you as a committer :-)

On Tue, Jun 19, 2018 at 5:06 AM, Đạt Cao Mạnh 
wrote:

> Welcome Nhat! Another Vietnamese guy!!
>
> On Tue, Jun 19, 2018 at 6:12 AM Karl Wright  wrote:
>
>> Welcome!!
>> Karl
>> On Mon, Jun 18, 2018 at 4:42 PM Adrien Grand  wrote:
>>
>>> Hi all,
>>>
>>> Please join me in welcoming Nhat Nguyen as the latest Lucene/Solr
>>> committer.
>>> Nhat, it's tradition for you to introduce yourself with a brief bio.
>>>
>>> Congratulations and Welcome!
>>>
>>> Adrien
>>>
>>


[jira] [Commented] (SOLR-12428) Adding LTR jar to _default configset

2018-06-18 Thread Ishan Chattopadhyaya (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515526#comment-16515526
 ] 

Ishan Chattopadhyaya commented on SOLR-12428:
-

{quote}Would the {{queryParser}}, {{cache}} and {{transformer}} elements be 
added too? Or would just the {{lib}} be present out-of-the-box leaving users to 
add the rest e.g. via the Config API (haven't tried that actually yet myself 
with LTR) using names of their choice?
{quote}
I'm inclined to leave those aside and let the user configure them through the 
Config API. The only thing the user cannot currently do is add the jar itself, 
which is what I was hoping to enable here.

> Adding LTR jar to _default configset
> 
>
> Key: SOLR-12428
> URL: https://issues.apache.org/jira/browse/SOLR-12428
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
>
> Even though Solr comes out of the box with the LTR capabilities, it is not 
> possible to use them in an existing collection without hand editing the 
> solrconfig.xml to add the jar. Many other contrib jars are already present in 
> the _default configset's solrconfig.xml.
> I propose to add the ltr jar in the _default configset's solrconfig:
> {code}
> <lib dir="${solr.install.dir:../../../..}/contrib/ltr/lib/" regex=".*\.jar" />
> {code}
> Any thoughts, [~cpoerschke]?






[jira] [Created] (SOLR-12428) Adding LTR jar to _default configset

2018-05-30 Thread Ishan Chattopadhyaya (JIRA)
Ishan Chattopadhyaya created SOLR-12428:
---

 Summary: Adding LTR jar to _default configset
 Key: SOLR-12428
 URL: https://issues.apache.org/jira/browse/SOLR-12428
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Ishan Chattopadhyaya
Assignee: Ishan Chattopadhyaya


Even though Solr comes out of the box with the LTR capabilities, it is not 
possible to use them in an existing collection without hand editing the 
solrconfig.xml to add the jar. Many other contrib jars are already present in 
the _default configset's solrconfig.xml.

I propose to add the ltr jar in the _default configset's solrconfig:
{code}
<lib dir="${solr.install.dir:../../../..}/contrib/ltr/lib/" regex=".*\.jar" />
{code}

Any thoughts, [~cpoerschke]?






[ANNOUNCE] Apache Solr 6.6.4 released

2018-05-18 Thread Ishan Chattopadhyaya
18 May 2018, Apache Solr™ 6.6.4 available

The Lucene PMC is pleased to announce the release of Apache Solr 6.6.4

Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, faceted search and analytics, rich document
parsing, geospatial search, extensive REST APIs as well as parallel SQL.
Solr is enterprise grade, secure and highly scalable, providing fault
tolerant distributed search and indexing, and powers the search and
navigation features of many of the world's largest internet sites.

This release includes 1 bug fix since the 6.6.3 release:

* Do not allow to use absolute URIs for including other files in
solrconfig.xml and schema parsing

The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/solr/6.6.4

Please read CHANGES.txt for a detailed list of changes:

https://lucene.apache.org/solr/6_6_4/changes/Changes.html

Please report any feedback to the mailing lists (
http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also goes for Maven access.




[ANNOUNCE] Apache Lucene 6.6.4 released

2018-05-18 Thread Ishan Chattopadhyaya
18 May 2018, Apache Lucene™ 6.6.4 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.4.

Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-text search, especially cross-platform.

This release contains one bug fix. The release is available for immediate
download at:
http://lucene.apache.org/core/mirrors-core-latest-redir.html

Further details of changes are available in the change log available at:
http://lucene.apache.org/core/6_6_4/changes/Changes.html

Please report any feedback to the mailing lists (
http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also applies to Maven access.




[jira] [Commented] (SOLR-12333) Redundant code in JSON response writer

2018-05-11 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472334#comment-16472334
 ] 

Ishan Chattopadhyaya commented on SOLR-12333:
-

+1.. Thanks [~dsmiley], [~mkhludnev] :-)

> Redundant code in JSON response writer 
> ---
>
> Key: SOLR-12333
> URL: https://issues.apache.org/jira/browse/SOLR-12333
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Response Writers
>Affects Versions: 7.4
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-12333.patch
>
>
> https://issues.apache.org/jira/browse/SOLR-12096?focusedCommentId=16467537=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16467537






[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2018-05-10 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16470180#comment-16470180
 ] 

Ishan Chattopadhyaya commented on SOLR-10335:
-

[~shalinmangar], can you please port the changes to branch_6x?

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 5.5.5, 6.6.2, 7.1
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Commented] (SOLR-12338) Replay buffering tlog in parallel

2018-05-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469869#comment-16469869
 ] 

Ishan Chattopadhyaya commented on SOLR-12338:
-

There are some situations where if in-place updates and DBQs are re-ordered, 
then the entire document needs to be fetched from the leader. This is fine when 
we have an active leader, but in case of tlog replay, we would need to apply 
those updates in the same order.

I think if DBQs are executed in the right order (i.e. all updates before a DBQ 
are executed before the DBQ, and all updates after the DBQ are executed after 
the DBQ), then we can run the other updates in parallel.

Example:
{code:java}
add1
add2
add3
dbq1
add4
add5
add6
..
add20
dbq2
{code}
Here, each add# is either a full-document update or an in-place update. I 
suggest: run updates add1-add3 in parallel, wait till they are done before 
executing dbq1, then run add4-add20 in parallel, wait, and execute dbq2. 
This should be fine, I think. (CC [~hossman], wdyt?)
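A minimal sketch of that barrier scheme (hypothetical Java; the log encoding, `replay` method, and thread-pool sizing are illustrative and not Solr's actual tlog-replay code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TlogReplaySketch {
    // Replays a buffered tlog: "add:*" entries run concurrently, but the pool
    // is drained before each "dbq:*" entry so every DBQ sees exactly the
    // updates that preceded it in the log, and none that followed.
    static List<String> replay(List<String> log, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> applied = Collections.synchronizedList(new ArrayList<>());
        List<Future<?>> inFlight = new ArrayList<>();
        try {
            for (String entry : log) {
                if (entry.startsWith("dbq:")) {
                    for (Future<?> f : inFlight) f.get(); // barrier before DBQ
                    inFlight.clear();
                    applied.add(entry);   // DBQ applied alone, in log order
                } else {
                    inFlight.add(pool.submit(() -> applied.add(entry)));
                }
            }
            for (Future<?> f : inFlight) f.get(); // drain the trailing adds
        } finally {
            pool.shutdown();
        }
        return applied;
    }
}
```

Adds between two DBQs complete in arbitrary order relative to each other, which is safe because they have independent ids; only the add/DBQ ordering is enforced.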

> Replay buffering tlog in parallel
> -
>
> Key: SOLR-12338
> URL: https://issues.apache.org/jira/browse/SOLR-12338
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-12338.patch
>
>
> Since updates with different id are independent, therefore it is safe to 
> replay them in parallel. This will significantly reduce recovering time of 
> replicas in high load indexing environment. 






[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2018-05-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469794#comment-16469794
 ] 

Ishan Chattopadhyaya commented on SOLR-10335:
-

I just realized that this change was ported to branch_6_6, but not branch_6x. 
Is that fine?

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 5.5.5, 6.6.2, 7.1
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( > 1MB) to Zookeeper

2018-04-19 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445350#comment-16445350
 ] 

Ishan Chattopadhyaya commented on SOLR-4793:


I think the long-term solution could be to implement something like a 
BlobStoreResourceLoader, so that a configset (as a whole or in parts) could be 
loaded from either ZK or the blob store.

> Solr Cloud can't upload large config files ( > 1MB)  to Zookeeper
> -
>
> Key: SOLR-4793
> URL: https://issues.apache.org/jira/browse/SOLR-4793
> Project: Solr
>  Issue Type: Improvement
>Reporter: Son Nguyen
>Priority: Major
> Attachments: SOLR-4793.patch
>
>
> Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
> Cloud with some large config files, like synonyms.txt.
> Jan Høydahl has a good idea:
> "SolrCloud is designed with an assumption that you should be able to upload 
> your whole disk-based conf folder into ZK, and that you should be able to add 
> an empty Solr node to a cluster and it would download all config from ZK. So 
> immediately a splitting strategy automatically handled by ZkSolrResourceLoader 
> for large files could be one way forward, i.e. store synonyms.txt as e.g. 
> __001_synonyms.txt __002_synonyms.txt"
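The naming scheme quoted above implies a straightforward chunking step; a hypothetical sketch (nothing here is Solr code — `ZkChunker`, `split`, and the constant merely mirror ZooKeeper's default ~1MB znode cap):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class ZkChunker {
    // ZooKeeper's default znode payload limit (jute.maxbuffer) is ~1MB.
    static final int LIMIT = 1024 * 1024;

    // Splits one config file into znode-sized pieces using the proposed
    // naming scheme: __001_synonyms.txt, __002_synonyms.txt, ...
    static Map<String, byte[]> split(String name, byte[] data) {
        Map<String, byte[]> chunks = new LinkedHashMap<>();
        int n = (data.length + LIMIT - 1) / LIMIT; // ceiling division
        for (int i = 0; i < n; i++) {
            int from = i * LIMIT;
            int to = Math.min(data.length, from + LIMIT);
            chunks.put(String.format("__%03d_%s", i + 1, name),
                       Arrays.copyOfRange(data, from, to));
        }
        return chunks;
    }
}
```

The resource loader would then reassemble the pieces in name order on download, which keeps the "upload a conf folder / download all config" model intact.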






[jira] [Created] (SOLR-12239) Enabling index sorting causes "segment not sorted with indexSort=null"

2018-04-18 Thread Ishan Chattopadhyaya (JIRA)
Ishan Chattopadhyaya created SOLR-12239:
---

 Summary: Enabling index sorting causes "segment not sorted with 
indexSort=null"
 Key: SOLR-12239
 URL: https://issues.apache.org/jira/browse/SOLR-12239
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 7.1
Reporter: Ishan Chattopadhyaya


When index sorting is enabled on an existing collection/index (using 
SortingMergePolicy), the collection reload causes the following exception:

{code}
java.util.concurrent.ExecutionException: org.apache.solr.common.SolrException: 
Unable to create core [mycoll_shard1_replica_n1]
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.solr.core.CoreContainer.lambda$load$14(CoreContainer.java:671)
at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.common.SolrException: Unable to create core 
[mycoll_shard1_replica_n1]
at 
org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1045)
at 
org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:642)
at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
... 5 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.(SolrCore.java:989)
at org.apache.solr.core.SolrCore.(SolrCore.java:844)
at 
org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1029)
... 7 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2076)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2196)
at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1072)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:961)
... 9 more
Caused by: org.apache.lucene.index.CorruptIndexException: segment not sorted 
with indexSort=null (resource=_0(7.1.0):C1)
at 
org.apache.lucene.index.IndexWriter.validateIndexSort(IndexWriter.java:1185)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1108)
at 
org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:119)
at 
org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:94)
at 
org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:257)
at 
org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:131)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2037)
... 12 more
{code}

This means that the user effectively needs to delete the index segments, reload 
the collection, and then re-index. This is a bad user experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-12230) Deprecate SortingMergePolicy

2018-04-17 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-12230:

Attachment: SOLR-12230.patch

WIP patch for this. Need to add tests and documentation changes.

> Deprecate SortingMergePolicy
> 
>
> Key: SOLR-12230
> URL: https://issues.apache.org/jira/browse/SOLR-12230
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-12230.patch
>
>
> The SortingMergePolicy should be deprecated, since first-class support is now 
> available (LUCENE-6766). The indexSort configuration can be accepted via the 
> solrconfig's indexConfig section directly, and SMP can throw a deprecation 
> warning through the 7.x versions of Solr.






[jira] [Created] (SOLR-12230) Deprecate SortingMergePolicy

2018-04-17 Thread Ishan Chattopadhyaya (JIRA)
Ishan Chattopadhyaya created SOLR-12230:
---

 Summary: Deprecate SortingMergePolicy
 Key: SOLR-12230
 URL: https://issues.apache.org/jira/browse/SOLR-12230
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Ishan Chattopadhyaya


The SortingMergePolicy should be deprecated, since first-class support is now 
available (LUCENE-6766). The indexSort configuration can be accepted via the 
solrconfig's indexConfig section directly, and SMP can throw a deprecation 
warning through the 7.x versions of Solr.






[jira] [Resolved] (SOLR-11920) Differential file copy for IndexFetcher

2018-04-17 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-11920.
-
   Resolution: Fixed
Fix Version/s: 7.4

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-11920.patch, SOLR-11920.patch, 
> thetaphi.Lucene.Solr.7.x.Linux.1675.log.txt.gz
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether or not that exact file exists with the 
> replica/slave. This is wasteful, esp. in tlog replicas or pull replicas, when 
> only a fraction of the total files are different.
> This stems from SOLR-11815.
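The differential copy described above boils down to diffing file manifests: copy a file only if it is missing locally or differs in size/checksum. A minimal Python sketch under that assumption (hypothetical helper and variable names, not Solr's actual IndexFetcher code):

```python
# Sketch of a differential index fetch: given the leader's file manifest
# (name -> (size, checksum)) and the replica's local manifest, return only
# the files that actually need to be copied. Illustrative, not Solr's API.
def files_to_fetch(leader_manifest, local_manifest):
    to_fetch = []
    for name, (size, checksum) in leader_manifest.items():
        # Copy only if the file is missing locally or size/checksum differ.
        if local_manifest.get(name) != (size, checksum):
            to_fetch.append(name)
    return to_fetch

leader = {"_0.cfs": (1024, "abc"), "_1.cfs": (2048, "def"), "segments_2": (300, "xyz")}
local = {"_0.cfs": (1024, "abc"), "_1.cfs": (2048, "stale"), "segments_1": (290, "old")}
print(files_to_fetch(leader, local))  # -> ['_1.cfs', 'segments_2']
```

With identical files skipped even under fullCopy=true, only the changed fraction of the index crosses the network, which is the saving this issue targets.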






[jira] [Commented] (SOLR-11920) Differential file copy for IndexFetcher

2018-04-17 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440445#comment-16440445
 ] 

Ishan Chattopadhyaya commented on SOLR-11920:
-

This has not yet been released; it is slated for the 7.4 release.

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11920.patch, SOLR-11920.patch, 
> thetaphi.Lucene.Solr.7.x.Linux.1675.log.txt.gz
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether or not that exact file exists with the 
> replica/slave. This is wasteful, esp. in tlog replicas or pull replicas, when 
> only a fraction of the total files are different.
> This stems from SOLR-11815.






[jira] [Commented] (SOLR-11920) Differential file copy for IndexFetcher

2018-04-16 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440401#comment-16440401
 ] 

Ishan Chattopadhyaya commented on SOLR-11920:
-

I've pushed a fix for these failures (they were genuine code issues, not just 
test issues). I'll keep an eye out for further failures related to this and 
close this once I see none. Thanks [~steve_rowe].

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11920.patch, SOLR-11920.patch, 
> thetaphi.Lucene.Solr.7.x.Linux.1675.log.txt.gz
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether or not that exact file exists with the 
> replica/slave. This is wasteful, esp. in tlog replicas or pull replicas, when 
> only a fraction of the total files are different.
> This stems from SOLR-11815.






Re: BugFix release 7.3.1

2018-04-16 Thread Ishan Chattopadhyaya
+1

On Mon, Apr 16, 2018 at 9:51 PM, Varun Thacker  wrote:

> +1
>
> On Mon, Apr 16, 2018 at 7:39 AM, Adrien Grand  wrote:
>
>> +1
>>
>> Le lun. 16 avr. 2018 à 16:22, Alan Woodward  a
>> écrit :
>>
>>> +1
>>>
>>> I’d like to get LUCENE-8254 in as well.
>>>
>>>
>>> On 16 Apr 2018, at 15:15, Đạt Cao Mạnh  wrote:
>>>
>>> Note: Resent another mail because the previous one missed 7.3.1 in the
>>> title.
>>> Hi,
>>>
>>> There are some bugs fixed in 7.4 (SOLR-12066, SOLR-12087, and
>>> SOLR-12088) which can cause some annoying problems for users who are using
>>> the autoscaling framework (or frequently calling the DeleteReplica API),
>>> so I wanted to ask if anyone objects to a bugfix release for 7.3
>>> (7.3.1). I also volunteer to be the release manager for this one if it is
>>> accepted.
>>>
>>>
>>>
>>>
>>>
>


[jira] [Commented] (SOLR-11920) Differential file copy for IndexFetcher

2018-04-12 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436797#comment-16436797
 ] 

Ishan Chattopadhyaya commented on SOLR-11920:
-

Thanks for reporting, Steve. I'll look into this.

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11920.patch, SOLR-11920.patch, 
> thetaphi.Lucene.Solr.7.x.Linux.1675.log.txt.gz
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether or not that exact file exists with the 
> replica/slave. This is wasteful, esp. in tlog replicas or pull replicas, when 
> only a fraction of the total files are different.
> This stems from SOLR-11815.






[jira] [Updated] (SOLR-12216) Add support for cross-cloud join

2018-04-12 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-12216:

Priority: Major  (was: Trivial)

> Add support for cross-cloud join 
> -
>
> Key: SOLR-12216
> URL: https://issues.apache.org/jira/browse/SOLR-12216
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Horatiu Lazu
>Priority: Major
>
> This patch proposes extending the capabilities of the built-in join to allow 
> joining across SolrClouds. Similar to streaming's search function, the user 
> can directly specify the zkHost of the other SolrCloud, and the rest of the 
> syntax (from, to, fromIndex) can remain the same. This join would be 
> triggered when the zkHost parameter is specified, containing the address of 
> the other SolrCloud cluster. It could also be packaged as a separate plugin.
>  
> In my testing, my current implementation is on average 4.5x faster than an 
> equivalent streaming expression intersecting from two search queries, one of 
> which streams from another collection on another SolrCloud. 
> h5. How it works
> Similar to the existing join, I created a QParser, but this join works as a 
> post-filter. The join first populates a hash set containing fields from the 
> “from” index (i.e., the index that’s not the one we’re running the query 
> from). To obtain the fields, it establishes a connection with the other 
> SolrCloud using SolrJ through the ZooKeeper address specified, and then uses 
> a custom request handler that performs the query on the “from” index and 
> returns an array of strings containing the field values. Then, on the 
> “to” index, it iterates through the array sent as JavaBin and adds each 
> value to the hash set. After that, we iterate through the NumericDocList for 
> the “to” core’s join field, and if a value within the NumericDocList is 
> found within our hash set, we collect it inside the DelegatingCollector.
> This allows for joining across sharded collections as well. 
> h5. How I benchmarked
> I created a web-app that first reloads the collections, then sends 25 AJAX 
> requests at once to the Solr endpoint with varying query sizes (between 127 
> search results and 690,000), and records the results. After all 
> responses are returned, the collection is reloaded, and the equivalent 
> streaming expressions are tested. This process is repeated 15 times, and the 
> average of the results is taken. 
> Note: The first two requests are not counted in the statistics, because they 
> “warm up” the collection. For reference, after bouncing Solr and executing at 
> least one query, joining two collections with about 690,000 results takes on 
> average ~890ms, while the equivalent streaming expression takes ~4.5 seconds.
>  
> I have written unit tests as well. I would appreciate some comments 
> on this. Thank you.
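The two-phase flow described above is essentially a hash join: phase one fetches the join keys from the remote ("from") SolrCloud, phase two scans the local ("to") index's doc values and collects matches. A minimal Python sketch with illustrative names (the real implementation uses SolrJ, JavaBin, and a DelegatingCollector):

```python
def cross_cloud_join(fetch_from_keys, to_side_docvalues):
    # Phase 1: build a hash set of join keys returned by the remote side.
    keys = set(fetch_from_keys())
    # Phase 2: collect local doc ids whose doc-values key is in the set,
    # mimicking the post-filter's pass over the "to" core's join field.
    return [doc_id for doc_id, key in to_side_docvalues if key in keys]

remote_keys = lambda: ["k1", "k3"]  # stands in for the remote request handler call
local_docs = [(0, "k1"), (1, "k2"), (2, "k3"), (3, "k1")]
print(cross_cloud_join(remote_keys, local_docs))  # -> [0, 2, 3]
```

Because set membership tests are O(1), the cost is one pass over each side, consistent with the speedup reported over intersecting two streaming searches.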






[jira] [Commented] (SOLR-12216) Add support for cross-cloud join

2018-04-12 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436142#comment-16436142
 ] 

Ishan Chattopadhyaya commented on SOLR-12216:
-

Where is the patch that you've referred to in the description?

> Add support for cross-cloud join 
> -
>
> Key: SOLR-12216
> URL: https://issues.apache.org/jira/browse/SOLR-12216
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Horatiu Lazu
>Priority: Trivial
>
> This patch proposes extending the capabilities of the built-in join to allow 
> joining across SolrClouds. Similar to streaming's search function, the user 
> can directly specify the zkHost of the other SolrCloud, and the rest of the 
> syntax (from, to, fromIndex) can remain the same. This join would be 
> triggered when the zkHost parameter is specified, containing the address of 
> the other SolrCloud cluster. It could also be packaged as a separate plugin.
>  
> In my testing, my current implementation is on average 4.5x faster than an 
> equivalent streaming expression intersecting from two search queries, one of 
> which streams from another collection on another SolrCloud. 
> h5. How it works
> Similar to the existing join, I created a QParser, but this join works as a 
> post-filter. The join first populates a hash set containing fields from the 
> “from” index (i.e., the index that’s not the one we’re running the query 
> from). To obtain the fields, it establishes a connection with the other 
> SolrCloud using SolrJ through the ZooKeeper address specified, and then uses 
> a custom request handler that performs the query on the “from” index and 
> returns an array of strings containing the field values. Then, on the 
> “to” index, it iterates through the array sent as JavaBin and adds each 
> value to the hash set. After that, we iterate through the NumericDocList for 
> the “to” core’s join field, and if a value within the NumericDocList is 
> found within our hash set, we collect it inside the DelegatingCollector.
> This allows for joining across sharded collections as well. 
> h5. How I benchmarked
> I created a web-app that first reloads the collections, then sends 25 AJAX 
> requests at once to the Solr endpoint with varying query sizes (between 127 
> search results and 690,000), and records the results. After all 
> responses are returned, the collection is reloaded, and the equivalent 
> streaming expressions are tested. This process is repeated 15 times, and the 
> average of the results is taken. 
> Note: The first two requests are not counted in the statistics, because they 
> “warm up” the collection. For reference, after bouncing Solr and executing at 
> least one query, joining two collections with about 690,000 results takes on 
> average ~890ms, while the equivalent streaming expression takes ~4.5 seconds.
>  
> I have written unit tests as well. I would appreciate some comments 
> on this. Thank you.






[jira] [Resolved] (SOLR-12096) Inconsistent response format in subquery transform

2018-04-10 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-12096.
-
Resolution: Fixed

> Inconsistent response format in subquery transform
> --
>
> Key: SOLR-12096
> URL: https://issues.apache.org/jira/browse/SOLR-12096
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12096.patch, SOLR-12096.patch, SOLR-12096.patch, 
> SOLR-12096.patch, SOLR-12096.patch, SOLR-12096.testsubquery.patch
>
>
> Solr version - 6.6.2
> The response of subquery transform is inconsistent with multi-shard compared 
> to single-shard
> h1. Single Shard collection
> Request 
> {code:java}
> localhost:8983/solr/k_test/search?sort=score desc,uniqueId 
> desc&q.op=AND&wt=json&q={!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})&origQuery=false&fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false&qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3&facet=false&_children_.q={!edismax
>  qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response for above request
> {code:json}
> {
> "responseHeader": {
> "zkConnected": true,
> "status": 0,
> "QTime": 0,
> "params": {
> "fl": [
> "uniqueId",
> "score",
> "_children_:[subquery]",
> "uniqueId"
> ],
> "origQuery": "false",
> "q.op": "AND",
> "_children_.rows": "3",
> "sort": "score desc,uniqueId desc",
> "rows": "1",
> "q": "{!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})",
> "qf": "parent_field",
> "spellcheck": "false",
> "_children_.q": "{!edismax qf=parentId v=$row.uniqueId}",
> "_children_.fl": [
> "uniqueId",
> "score"
> ],
> "wt": "json",
> "facet": "false"
> }
> },
> "response": {
> "numFound": 1,
> "start": 0,
> "maxScore": 0.5,
> "docs": [
> {
> "uniqueId": "10001677",
> "score": 0.5,
> "_children_": {
> "numFound": 9,
> "start": 0,
> "docs": [
> {
> "uniqueId": "100016771",
> "score": 0.5
> },
> {
> "uniqueId": "100016772",
> "score": 0.5
> },
> {
> "uniqueId": "100016773",
> "score": 0.5
> }
> ]
> }
> }
> ]
> }
> }
> {code}
> Here, the *_children_* subquery response is as expected (based on the documentation)
> h1. Multi Shard collection(2)
> Request
> {code:java}
> localhost:8983/solr/k_test_2/search?sort=score desc,uniqueId 
> desc&q.op=AND&wt=json&q={!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})&origQuery=false&fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false&qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3&facet=false&_children_.q={!edismax
>  qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response
> {code:json}
> {
> "responseHeader": {
> "zkConnected": true,
> "status": 0,
> "QTime": 11,
> "params": {
> "fl": [
> "uniqueId",
> "score",
>

Re: lucene-solr:branch_7x: SOLR-12096: Fixed inconsistent results format of subquery transformer for distributed search (multi-shard)

2018-04-10 Thread Ishan Chattopadhyaya
Indeed, I had. Sorry for the inconvenience. :-)

On Mon, Apr 9, 2018 at 5:10 PM, Alan Woodward <romseyg...@gmail.com> wrote:

> Hey Ishan, I think you inadvertently committed the patch file as well?
>
> > On 9 Apr 2018, at 12:07, is...@apache.org wrote:
> >
> > Repository: lucene-solr
> > Updated Branches:
> > refs/heads/branch_7x 918ecb84c -> 91bd3e1f1
> >
> >
> > SOLR-12096: Fixed inconsistent results format of subquery transformer
> for distributed search (multi-shard)
> >
> >
> > Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
> > Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/
> 91bd3e1f
> > Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/91bd3e1f
> > Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/91bd3e1f
> >
> > Branch: refs/heads/branch_7x
> > Commit: 91bd3e1f1febf1d0953186be8cbc9b4e2146e579
> > Parents: 918ecb8
> > Author: Ishan Chattopadhyaya <is...@apache.org>
> > Authored: Mon Apr 9 16:36:07 2018 +0530
> > Committer: Ishan Chattopadhyaya <is...@apache.org>
> > Committed: Mon Apr 9 16:36:49 2018 +0530
> >
> > --
> > SOLR-12096.patch| 217 +++
> > solr/CHANGES.txt|   3 +
> > .../solr/response/GeoJSONResponseWriter.java|   3 +-
> > .../solr/response/JSONResponseWriter.java   |   6 +-
> > .../apache/solr/response/JSONWriterTest.java|  24 +-
> > .../TestSubQueryTransformerDistrib.java |  59 +++--
> > 6 files changed, 289 insertions(+), 23 deletions(-)
> > --
> >
> >
> > http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/
> 91bd3e1f/SOLR-12096.patch
> > --
> > diff --git a/SOLR-12096.patch b/SOLR-12096.patch
> > new file mode 100644
> > index 000..9ed1ad7
> > --- /dev/null
> > +++ b/SOLR-12096.patch
> > @@ -0,0 +1,217 @@
> > +diff --git 
> > a/solr/core/src/java/org/apache/solr/response/GeoJSONResponseWriter.java
> b/solr/core/src/java/org/apache/solr/response/GeoJSONResponseWriter.java
> > +index 43fd7b4..012290e 100644
> > +--- a/solr/core/src/java/org/apache/solr/response/
> GeoJSONResponseWriter.java
> >  b/solr/core/src/java/org/apache/solr/response/
> GeoJSONResponseWriter.java
> > +@@ -166,7 +166,8 @@ class GeoJSONWriter extends JSONWriter {
> > +
> > +   // SolrDocument will now have multiValued fields represented as
> a Collection,
> > +   // even if only a single value is returned for this document.
> > +-  if (val instanceof List) {
> > ++  // For SolrDocumentList, use writeVal instead of writeArray
> > ++  if (!(val instanceof SolrDocumentList) && val instanceof List) {
> > + // shortcut this common case instead of going through writeVal
> again
> > + writeArray(name,((Iterable)val).iterator());
> > +   } else {
> > +diff --git 
> > a/solr/core/src/java/org/apache/solr/response/JSONResponseWriter.java
> b/solr/core/src/java/org/apache/solr/response/JSONResponseWriter.java
> > +index 513df4e..5f6e2f2 100644
> > +--- a/solr/core/src/java/org/apache/solr/response/
> JSONResponseWriter.java
> >  b/solr/core/src/java/org/apache/solr/response/
> JSONResponseWriter.java
> > +@@ -25,10 +25,11 @@ import java.util.Map;
> > + import java.util.Set;
> > +
> > + import org.apache.solr.common.IteratorWriter;
> > ++import org.apache.solr.common.MapWriter;
> > + import org.apache.solr.common.MapWriter.EntryWriter;
> > + import org.apache.solr.common.PushWriter;
> > + import org.apache.solr.common.SolrDocument;
> > +-import org.apache.solr.common.MapWriter;
> > ++import org.apache.solr.common.SolrDocumentList;
> > + import org.apache.solr.common.params.SolrParams;
> > + import org.apache.solr.common.util.NamedList;
> > + import org.apache.solr.common.util.SimpleOrderedMap;
> > +@@ -367,7 +368,8 @@ class JSONWriter extends TextResponseWriter {
> > +
> > +   // SolrDocument will now have multiValued fields represented as
> a Collection,
> > +   // even if only a single value is returned for this document.
> > +-  if (val instanceof List) {
> > ++  // For SolrDocumentList, use writeVal instead of writeArray
> > ++  if (!(val instanceof SolrDocumentList) && val instanceof List) {
> > + // shortcu

[jira] [Assigned] (SOLR-12188) Inconsistent behavior with CREATE collection API

2018-04-10 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-12188:
---

Assignee: Ishan Chattopadhyaya

> Inconsistent behavior with CREATE collection API
> 
>
> Key: SOLR-12188
> URL: https://issues.apache.org/jira/browse/SOLR-12188
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, config-api
>Affects Versions: 7.2
>Reporter: Munendra S N
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-12188.patch
>
>
> If collection.configName is not specified during collection creation, the 
> _default configSet is used to create a mutable configSet (with the suffix 
> AUTOCREATED).
> * In the Admin UI, it is mandatory to specify a configSet. This behavior is 
> inconsistent with the CREATE collection API (where it is not mandatory).
> * Both in the Admin UI and the CREATE API, when _default is specified as the 
> configSet, no mutable configSet is created. So, changes in one collection 
> would reflect in the other.






[jira] [Commented] (SOLR-12096) Inconsistent response format in subquery transform

2018-04-10 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16431819#comment-16431819
 ] 

Ishan Chattopadhyaya commented on SOLR-12096:
-

Committed a fix; collaborated with [~munendrasn] offline on this fix.
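The committed change (quoted in the branch_7x commit earlier in this thread) special-cases SolrDocumentList in the JSON writers: it is a List, but a subquery result must be serialized as a full result object, not a bare array. A rough Python sketch of that dispatch, using stand-in classes rather than Solr's writer API:

```python
class SolrDocumentList(list):
    """Stand-in for Solr's SolrDocumentList: a list of docs plus metadata."""
    def __init__(self, docs, num_found, start=0):
        super().__init__(docs)
        self.num_found = num_found
        self.start = start

def write_value(val):
    # The fix: check SolrDocumentList BEFORE the generic List shortcut,
    # so subquery results keep their numFound/start/docs envelope.
    if isinstance(val, SolrDocumentList):
        return {"numFound": val.num_found, "start": val.start, "docs": list(val)}
    if isinstance(val, list):
        return list(val)  # plain multiValued field -> bare JSON array
    return val

children = SolrDocumentList([{"id": "1"}], num_found=9)
print(write_value(children))    # envelope, not a bare array
print(write_value(["a", "b"]))  # -> ['a', 'b']
```

Without the extra check, the generic `isinstance(val, list)` branch would win and flatten the subquery result, which is the single-shard vs multi-shard inconsistency this issue reports.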

> Inconsistent response format in subquery transform
> --
>
> Key: SOLR-12096
> URL: https://issues.apache.org/jira/browse/SOLR-12096
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12096.patch, SOLR-12096.patch, SOLR-12096.patch, 
> SOLR-12096.patch, SOLR-12096.patch, SOLR-12096.testsubquery.patch
>
>
> Solr version - 6.6.2
> The response of subquery transform is inconsistent with multi-shard compared 
> to single-shard
> h1. Single Shard collection
> Request 
> {code:java}
> localhost:8983/solr/k_test/search?sort=score desc,uniqueId 
> desc&q.op=AND&wt=json&q={!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})&origQuery=false&fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false&qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3&facet=false&_children_.q={!edismax
>  qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response for above request
> {code:json}
> {
> "responseHeader": {
> "zkConnected": true,
> "status": 0,
> "QTime": 0,
> "params": {
> "fl": [
> "uniqueId",
> "score",
> "_children_:[subquery]",
> "uniqueId"
> ],
> "origQuery": "false",
> "q.op": "AND",
> "_children_.rows": "3",
> "sort": "score desc,uniqueId desc",
> "rows": "1",
> "q": "{!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})",
> "qf": "parent_field",
> "spellcheck": "false",
> "_children_.q": "{!edismax qf=parentId v=$row.uniqueId}",
> "_children_.fl": [
> "uniqueId",
> "score"
> ],
> "wt": "json",
> "facet": "false"
> }
> },
> "response": {
> "numFound": 1,
> "start": 0,
> "maxScore": 0.5,
> "docs": [
> {
> "uniqueId": "10001677",
> "score": 0.5,
> "_children_": {
> "numFound": 9,
> "start": 0,
> "docs": [
> {
> "uniqueId": "100016771",
> "score": 0.5
> },
> {
> "uniqueId": "100016772",
> "score": 0.5
> },
> {
> "uniqueId": "100016773",
> "score": 0.5
> }
> ]
> }
> }
> ]
> }
> }
> {code}
> Here, the *_children_* subquery response is as expected (based on the documentation)
> h1. Multi Shard collection(2)
> Request
> {code:java}
> localhost:8983/solr/k_test_2/search?sort=score desc,uniqueId 
> desc&q.op=AND&wt=json&q={!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})&origQuery=false&fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false&qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3&facet=false&_children_.q={!edismax
>  qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response
> {code:json}
> {
> "responseHeader": {
> "zkConnected": true,
> "status": 0,
> "QTime": 11,
> "params": {
>

[jira] [Reopened] (SOLR-12096) Inconsistent response format in subquery transform

2018-04-09 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reopened SOLR-12096:
-

Reopening. There are intermittent test failures: 
https://builds.apache.org/job/Lucene-Solr-Tests-7.x/556/testReport/junit/org.apache.solr.response.transform/TestSubQueryTransformerDistrib/test/

> Inconsistent response format in subquery transform
> --
>
> Key: SOLR-12096
> URL: https://issues.apache.org/jira/browse/SOLR-12096
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12096.patch, SOLR-12096.patch, SOLR-12096.patch, 
> SOLR-12096.patch, SOLR-12096.patch, SOLR-12096.testsubquery.patch
>
>
> Solr version - 6.6.2
> The response of subquery transform is inconsistent with multi-shard compared 
> to single-shard
> h1. Single Shard collection
> Request 
> {code:java}
> localhost:8983/solr/k_test/search?sort=score desc,uniqueId 
> desc&q.op=AND&wt=json&q={!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})&origQuery=false&fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false&qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3&facet=false&_children_.q={!edismax
>  qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response for above request
> {code:json}
> {
> "responseHeader": {
> "zkConnected": true,
> "status": 0,
> "QTime": 0,
> "params": {
> "fl": [
> "uniqueId",
> "score",
> "_children_:[subquery]",
> "uniqueId"
> ],
> "origQuery": "false",
> "q.op": "AND",
> "_children_.rows": "3",
> "sort": "score desc,uniqueId desc",
> "rows": "1",
> "q": "{!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})",
> "qf": "parent_field",
> "spellcheck": "false",
> "_children_.q": "{!edismax qf=parentId v=$row.uniqueId}",
> "_children_.fl": [
> "uniqueId",
> "score"
> ],
> "wt": "json",
> "facet": "false"
> }
> },
> "response": {
> "numFound": 1,
> "start": 0,
> "maxScore": 0.5,
> "docs": [
> {
> "uniqueId": "10001677",
> "score": 0.5,
> "_children_": {
> "numFound": 9,
> "start": 0,
> "docs": [
> {
> "uniqueId": "100016771",
> "score": 0.5
> },
> {
> "uniqueId": "100016772",
> "score": 0.5
> },
> {
> "uniqueId": "100016773",
> "score": 0.5
> }
> ]
> }
> }
> ]
> }
> }
> {code}
> Here, the *_children_* subquery response is as expected (based on the documentation)
> h1. Multi Shard collection(2)
> Request
> {code:java}
> localhost:8983/solr/k_test_2/search?sort=score desc,uniqueId 
> desc&q.op=AND&wt=json&q={!parent which=parent_field:true score=max}({!edismax 
> v=$origQuery})&origQuery=false&fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false&qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3&facet=false&_children_.q={!edismax
>  qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response
> {code:json}
> {
> "responseHeader": {
> "zkConnected": true,
> "status": 0,
> "QTime": 11,
>   

[jira] [Commented] (SOLR-12096) Inconsistent response format in subquery transform

2018-03-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412364#comment-16412364
 ] 

Ishan Chattopadhyaya commented on SOLR-12096:
-

Good catch! In the added unit test, instead of hardcoding the entire expected 
response, it would be better to do the validation using the parsed JSON. 
There's {{assertJQ}} that can be used for this purpose (have a look at its 
occurrences, e.g. TestInPlaceUpdatesStandalone etc.). The problem with 
hardcoding the expected response is that a change in whitespace (if, say, we 
change the underlying JSON library) would falsely fail this test.
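The brittleness being described can be shown outside Solr: a raw-string assertion fails on a pure formatting change, while comparing the parsed JSON (the assertJQ-style approach) does not. A small sketch with Python's standard json module:

```python
import json

expected = '{"docs": [{"id": "1", "score": 0.5}]}'
# Same payload, different whitespace (e.g. after swapping the JSON library).
actual = '{\n  "docs": [ {"id": "1", "score": 0.5} ]\n}'

assert expected != actual                          # raw strings falsely differ
assert json.loads(expected) == json.loads(actual)  # parsed values agree
print("parsed-JSON comparison ignores whitespace")
```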

> Inconsistent response format in subquery transform
> --
>
> Key: SOLR-12096
> URL: https://issues.apache.org/jira/browse/SOLR-12096
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-12096.patch, SOLR-12096.patch, SOLR-12096.patch
>
>
> Solr version: 6.6.2
> The response format of the subquery transform is inconsistent between 
> multi-shard and single-shard collections.
> h1. Single Shard collection
> Request 
> {code:java}
> localhost:8983/solr/k_test/search?sort=score desc,uniqueId desc&q.op=AND&wt=json
> &q={!parent which=parent_field:true score=max}({!edismax v=$origQuery})&origQuery=false
> &fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false
> &qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3
> &facet=false&_children_.q={!edismax qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response for above request
> {code:json}
> {
>   "responseHeader": {
>     "zkConnected": true,
>     "status": 0,
>     "QTime": 0,
>     "params": {
>       "fl": [
>         "uniqueId",
>         "score",
>         "_children_:[subquery]",
>         "uniqueId"
>       ],
>       "origQuery": "false",
>       "q.op": "AND",
>       "_children_.rows": "3",
>       "sort": "score desc,uniqueId desc",
>       "rows": "1",
>       "q": "{!parent which=parent_field:true score=max}({!edismax v=$origQuery})",
>       "qf": "parent_field",
>       "spellcheck": "false",
>       "_children_.q": "{!edismax qf=parentId v=$row.uniqueId}",
>       "_children_.fl": [
>         "uniqueId",
>         "score"
>       ],
>       "wt": "json",
>       "facet": "false"
>     }
>   },
>   "response": {
>     "numFound": 1,
>     "start": 0,
>     "maxScore": 0.5,
>     "docs": [
>       {
>         "uniqueId": "10001677",
>         "score": 0.5,
>         "_children_": {
>           "numFound": 9,
>           "start": 0,
>           "docs": [
>             {
>               "uniqueId": "100016771",
>               "score": 0.5
>             },
>             {
>               "uniqueId": "100016772",
>               "score": 0.5
>             },
>             {
>               "uniqueId": "100016773",
>               "score": 0.5
>             }
>           ]
>         }
>       }
>     ]
>   }
> }
> {code}
> Here, the *_children_* subquery response is as expected (based on the documentation).
> h1. Multi Shard collection (2 shards)
> Request
> {code:java}
> localhost:8983/solr/k_test_2/search?sort=score desc,uniqueId desc&q.op=AND&wt=json
> &q={!parent which=parent_field:true score=max}({!edismax v=$origQuery})&origQuery=false
> &fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false
> &qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3
> &facet=false&_children_.q={!edismax qf=parentId v=$row.uniqueId}&rows=1
> {code}

[jira] [Assigned] (SOLR-12096) Inconsistent response format in subquery transform

2018-03-23 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-12096:
---

Assignee: Ishan Chattopadhyaya

> Inconsistent response format in subquery transform
> --
>
> Key: SOLR-12096
> URL: https://issues.apache.org/jira/browse/SOLR-12096
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-12096.patch, SOLR-12096.patch, SOLR-12096.patch
>
>
> Solr version: 6.6.2
> The response format of the subquery transform is inconsistent between 
> multi-shard and single-shard collections.
> h1. Single Shard collection
> Request 
> {code:java}
> localhost:8983/solr/k_test/search?sort=score desc,uniqueId desc&q.op=AND&wt=json
> &q={!parent which=parent_field:true score=max}({!edismax v=$origQuery})&origQuery=false
> &fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false
> &qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3
> &facet=false&_children_.q={!edismax qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response for above request
> {code:json}
> {
>   "responseHeader": {
>     "zkConnected": true,
>     "status": 0,
>     "QTime": 0,
>     "params": {
>       "fl": [
>         "uniqueId",
>         "score",
>         "_children_:[subquery]",
>         "uniqueId"
>       ],
>       "origQuery": "false",
>       "q.op": "AND",
>       "_children_.rows": "3",
>       "sort": "score desc,uniqueId desc",
>       "rows": "1",
>       "q": "{!parent which=parent_field:true score=max}({!edismax v=$origQuery})",
>       "qf": "parent_field",
>       "spellcheck": "false",
>       "_children_.q": "{!edismax qf=parentId v=$row.uniqueId}",
>       "_children_.fl": [
>         "uniqueId",
>         "score"
>       ],
>       "wt": "json",
>       "facet": "false"
>     }
>   },
>   "response": {
>     "numFound": 1,
>     "start": 0,
>     "maxScore": 0.5,
>     "docs": [
>       {
>         "uniqueId": "10001677",
>         "score": 0.5,
>         "_children_": {
>           "numFound": 9,
>           "start": 0,
>           "docs": [
>             {
>               "uniqueId": "100016771",
>               "score": 0.5
>             },
>             {
>               "uniqueId": "100016772",
>               "score": 0.5
>             },
>             {
>               "uniqueId": "100016773",
>               "score": 0.5
>             }
>           ]
>         }
>       }
>     ]
>   }
> }
> {code}
> Here, the *_children_* subquery response is as expected (based on the documentation).
> h1. Multi Shard collection (2 shards)
> Request
> {code:java}
> localhost:8983/solr/k_test_2/search?sort=score desc,uniqueId desc&q.op=AND&wt=json
> &q={!parent which=parent_field:true score=max}({!edismax v=$origQuery})&origQuery=false
> &fl=uniqueId&fl=score&fl=_children_:[subquery]&fl=uniqueId&spellcheck=false
> &qf=parent_field&_children_.fl=uniqueId&_children_.fl=score&_children_.rows=3
> &facet=false&_children_.q={!edismax qf=parentId v=$row.uniqueId}&rows=1
> {code}
> Response
> {code:json}
> {
>   "responseHeader": {
>     "zkConnected": true,
>     "status": 0,
>     "QTime": 11,
>     "params": {
>       "fl": [
>         "uniqueId",
>         "score",
>         "_children_:[subquery]",
>         "uniqueId"

[jira] [Resolved] (SOLR-11920) Differential file copy for IndexFetcher

2018-03-17 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-11920.
-
Resolution: Fixed

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11920.patch, SOLR-11920.patch
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether an identical file already exists on the 
> replica/slave. This is wasteful, especially for tlog or pull replicas, where 
> only a fraction of the files actually differ.
> This stems from SOLR-11815.
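The differential copy the issue asks for amounts to: before fetching, compare each remote file against the local copy by size and checksum, and download only the files that differ. A minimal sketch of that selection step (hypothetical listing shape and helper name, not IndexFetcher's actual API):

```python
import hashlib
from pathlib import Path

def files_to_fetch(remote_listing, local_dir):
    """Return only the remote files that differ from the local copies.

    remote_listing: list of dicts like {"name": ..., "size": ..., "checksum": ...}
    (a hypothetical shape; the real IndexFetcher exchanges a similar per-file list).
    """
    needed = []
    for f in remote_listing:
        local = Path(local_dir) / f["name"]
        if not local.exists() or local.stat().st_size != f["size"]:
            needed.append(f)  # missing locally, or wrong size: must copy
            continue
        digest = hashlib.sha256(local.read_bytes()).hexdigest()
        if digest != f["checksum"]:
            needed.append(f)  # same size but different content
    return needed
```

With this filter in place, a fullCopy where most segment files are already present transfers only the handful of files that changed.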



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11920) Differential file copy for IndexFetcher

2018-02-26 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376649#comment-16376649
 ] 

Ishan Chattopadhyaya commented on SOLR-11920:
-

I just tested this patch on Windows. TestReplicationHandler was already failing 
intermittently for me there (without this patch), and the same behaviour 
continued with it; nothing in the failure logs suggested that this patch 
adversely affected anything. TestStressRecovery passed without any issues.

I'm planning to commit this soon. If someone has time to review this, please 
let me know. :-)

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11920.patch, SOLR-11920.patch
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether an identical file already exists on the 
> replica/slave. This is wasteful, especially for tlog or pull replicas, where 
> only a fraction of the files actually differ.
> This stems from SOLR-11815.






[jira] [Resolved] (SOLR-8327) SolrDispatchFilter is not caching new state format, which results in live fetch from ZK per request if node does not contain core from collection

2018-02-24 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-8327.

   Resolution: Fixed
Fix Version/s: 7.3
   master (8.0)

Thanks [~slackhappy]!

> SolrDispatchFilter is not caching new state format, which results in live 
> fetch from ZK per request if node does not contain core from collection
> -
>
> Key: SOLR-8327
> URL: https://issues.apache.org/jira/browse/SOLR-8327
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.3
>Reporter: Jessica Cheng Mallet
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
>  Labels: solrcloud
> Fix For: master (8.0), 7.3
>
> Attachments: SOLR-8327.patch, SOLR-8327.patch
>
>
> While perf testing with a non-SolrJ client (requests can be sent to any Solr 
> node), we noticed a huge amount of data from ZooKeeper in our tcpdump (~1G 
> for a 20-second dump). From the thread dump, we noticed this:
> java.lang.Object.wait (Native Method)
> java.lang.Object.wait (Object.java:503)
> org.apache.zookeeper.ClientCnxn.submitRequest (ClientCnxn.java:1309)
> org.apache.zookeeper.ZooKeeper.getData (ZooKeeper.java:1152)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:345)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation 
> (ZkCmdExecutor.java:61)
> org.apache.solr.common.cloud.SolrZkClient.getData (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkStateReader.getCollectionLive 
> (ZkStateReader.java:841)
> org.apache.solr.common.cloud.ZkStateReader$7.get (ZkStateReader.java:515)
> org.apache.solr.common.cloud.ClusterState.getCollectionOrNull 
> (ClusterState.java:175)
> org.apache.solr.common.cloud.ClusterState.getLeader (ClusterState.java:98)
> org.apache.solr.servlet.HttpSolrCall.getCoreByCollection 
> (HttpSolrCall.java:784)
> org.apache.solr.servlet.HttpSolrCall.init (HttpSolrCall.java:272)
> org.apache.solr.servlet.HttpSolrCall.call (HttpSolrCall.java:417)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:210)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:179)
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter 
> (ServletHandler.java:1652)
> org.eclipse.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:585)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:143)
> org.eclipse.jetty.security.SecurityHandler.handle (SecurityHandler.java:577)
> org.eclipse.jetty.server.session.SessionHandler.doHandle 
> (SessionHandler.java:223)
> org.eclipse.jetty.server.handler.ContextHandler.doHandle 
> (ContextHandler.java:1127)
> org.eclipse.jetty.servlet.ServletHandler.doScope (ServletHandler.java:515)
> org.eclipse.jetty.server.session.SessionHandler.doScope 
> (SessionHandler.java:185)
> org.eclipse.jetty.server.handler.ContextHandler.doScope 
> (ContextHandler.java:1061)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:141)
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle 
> (ContextHandlerCollection.java:215)
> org.eclipse.jetty.server.handler.HandlerCollection.handle 
> (HandlerCollection.java:110)
> org.eclipse.jetty.server.handler.HandlerWrapper.handle 
> (HandlerWrapper.java:97)
> org.eclipse.jetty.server.Server.handle (Server.java:499)
> org.eclipse.jetty.server.HttpChannel.handle (HttpChannel.java:310)
> org.eclipse.jetty.server.HttpConnection.onFillable (HttpConnection.java:257)
> org.eclipse.jetty.io.AbstractConnection$2.run (AbstractConnection.java:540)
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob 
> (QueuedThreadPool.java:635)
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run 
> (QueuedThreadPool.java:555)
> java.lang.Thread.run (Thread.java:745)
> Looks like SolrDispatchFilter doesn't have caching similar to the 
> collectionStateCache in CloudSolrClient, so if the node doesn't know about a 
> collection in the new state format, it just live-fetches it from ZooKeeper on 
> every request.






Re: Lucene/Solr 7.3

2018-02-23 Thread Ishan Chattopadhyaya
+1. That gives us enough time to get WIP stuff in.

On Fri, Feb 23, 2018 at 3:20 PM, Alan Woodward <
alan.woodw...@romseysoftware.co.uk> wrote:

> Hi all,
>
> It’s been a couple of months since the 7.2 release, and we’ve accumulated
> some nice new features since then.  I’d like to volunteer to be RM for a
> 7.3 release.
>
> I’m travelling for the next couple of weeks, so I would plan to create the
> release branch two weeks today, on the 9th March (unless anybody else wants
> to do it sooner, of course :)
>
> - Alan
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] [Resolved] (SOLR-10079) TestInPlaceUpdates(Distrib|Standalone) failures

2018-02-21 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-10079.
-
Resolution: Fixed

> TestInPlaceUpdates(Distrib|Standalone) failures
> ---
>
> Key: SOLR-10079
> URL: https://issues.apache.org/jira/browse/SOLR-10079
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 6.7, 7.0
>
> Attachments: SOLR-10079.patch, SOLR-10079.patch, stdout, 
> tests-failures.txt
>
>
> From [https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18881/], 
> reproduces for me:
> {noformat}
> Checking out Revision d8d61ff61d1d798f5e3853ef66bc485d0d403f18 
> (refs/remotes/origin/master)
> [...]
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestInPlaceUpdatesDistrib -Dtests.method=test 
> -Dtests.seed=E1BB56269B8215B0 -Dtests.multiplier=3 -Dtests.slow=true 
> -Dtests.locale=sr-Latn-RS -Dtests.timezone=America/Grand_Turk 
> -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>[junit4] FAILURE 77.7s J2 | TestInPlaceUpdatesDistrib.test <<<
>[junit4]> Throwable #1: java.lang.AssertionError: Earlier: [79, 79, 
> 79], now: [78, 78, 78]
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([E1BB56269B8215B0:69EF69FC357E7848]:0)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.ensureRtgWorksWithPartialUpdatesTest(TestInPlaceUpdatesDistrib.java:425)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:142)
>[junit4]>  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>[junit4]>  at 
> java.base/java.lang.reflect.Method.invoke(Method.java:543)
>[junit4]>  at 
> org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:985)
>[junit4]>  at 
> org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:960)
>[junit4]>  at java.base/java.lang.Thread.run(Thread.java:844)
> [...]
>[junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
> {id_i=PostingsFormat(name=LuceneFixedGap), title_s=FSTOrd50, 
> id=PostingsFormat(name=Asserting), 
> id_field_copy_that_does_not_support_in_place_update_s=FSTOrd50}, 
> docValues:{inplace_updatable_float=DocValuesFormat(name=Asserting), 
> id_i=DocValuesFormat(name=Direct), _version_=DocValuesFormat(name=Asserting), 
> title_s=DocValuesFormat(name=Lucene70), id=DocValuesFormat(name=Lucene70), 
> id_field_copy_that_does_not_support_in_place_update_s=DocValuesFormat(name=Lucene70),
>  inplace_updatable_int_with_default=DocValuesFormat(name=Asserting), 
> inplace_updatable_int=DocValuesFormat(name=Direct), 
> inplace_updatable_float_with_default=DocValuesFormat(name=Direct)}, 
> maxPointsInLeafNode=1342, maxMBSortInHeap=6.368734895089348, 
> sim=RandomSimilarity(queryNorm=true): {}, locale=sr-Latn-RS, 
> timezone=America/Grand_Turk
>[junit4]   2> NOTE: Linux 4.4.0-53-generic i386/Oracle Corporation 9-ea 
> (32-bit)/cpus=12,threads=1,free=107734480,total=518979584
> {noformat}






[jira] [Commented] (SOLR-10079) TestInPlaceUpdates(Distrib|Standalone) failures

2018-02-21 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371969#comment-16371969
 ] 

Ishan Chattopadhyaya commented on SOLR-10079:
-

This hasn't failed in a while. The above seed doesn't reproduce for me anymore:
{code}
ant test  -Dtestcase=TestInPlaceUpdatesDistrib -Dtests.method=test 
-Dtests.seed=F02AC8BA5333D665 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/lucene-data/enwiki.random.lines.txt 
-Dtests.locale=pt -Dtests.timezone=America/Indianapolis -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
{code}

> TestInPlaceUpdates(Distrib|Standalone) failures
> ---
>
> Key: SOLR-10079
> URL: https://issues.apache.org/jira/browse/SOLR-10079
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 6.7, 7.0
>
> Attachments: SOLR-10079.patch, SOLR-10079.patch, stdout, 
> tests-failures.txt
>
>
> From [https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18881/], 
> reproduces for me:
> {noformat}
> Checking out Revision d8d61ff61d1d798f5e3853ef66bc485d0d403f18 
> (refs/remotes/origin/master)
> [...]
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestInPlaceUpdatesDistrib -Dtests.method=test 
> -Dtests.seed=E1BB56269B8215B0 -Dtests.multiplier=3 -Dtests.slow=true 
> -Dtests.locale=sr-Latn-RS -Dtests.timezone=America/Grand_Turk 
> -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>[junit4] FAILURE 77.7s J2 | TestInPlaceUpdatesDistrib.test <<<
>[junit4]> Throwable #1: java.lang.AssertionError: Earlier: [79, 79, 
> 79], now: [78, 78, 78]
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([E1BB56269B8215B0:69EF69FC357E7848]:0)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.ensureRtgWorksWithPartialUpdatesTest(TestInPlaceUpdatesDistrib.java:425)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:142)
>[junit4]>  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>[junit4]>  at 
> java.base/java.lang.reflect.Method.invoke(Method.java:543)
>[junit4]>  at 
> org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:985)
>[junit4]>  at 
> org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:960)
>[junit4]>  at java.base/java.lang.Thread.run(Thread.java:844)
> [...]
>[junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
> {id_i=PostingsFormat(name=LuceneFixedGap), title_s=FSTOrd50, 
> id=PostingsFormat(name=Asserting), 
> id_field_copy_that_does_not_support_in_place_update_s=FSTOrd50}, 
> docValues:{inplace_updatable_float=DocValuesFormat(name=Asserting), 
> id_i=DocValuesFormat(name=Direct), _version_=DocValuesFormat(name=Asserting), 
> title_s=DocValuesFormat(name=Lucene70), id=DocValuesFormat(name=Lucene70), 
> id_field_copy_that_does_not_support_in_place_update_s=DocValuesFormat(name=Lucene70),
>  inplace_updatable_int_with_default=DocValuesFormat(name=Asserting), 
> inplace_updatable_int=DocValuesFormat(name=Direct), 
> inplace_updatable_float_with_default=DocValuesFormat(name=Direct)}, 
> maxPointsInLeafNode=1342, maxMBSortInHeap=6.368734895089348, 
> sim=RandomSimilarity(queryNorm=true): {}, locale=sr-Latn-RS, 
> timezone=America/Grand_Turk
>[junit4]   2> NOTE: Linux 4.4.0-53-generic i386/Oracle Corporation 9-ea 
> (32-bit)/cpus=12,threads=1,free=107734480,total=518979584
> {noformat}






[jira] [Updated] (SOLR-10261) TestStressCloudBlindAtomicUpdates.test_dv() fail

2018-02-21 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-10261:

Fix Version/s: 7.3

> TestStressCloudBlindAtomicUpdates.test_dv() fail
> 
>
> Key: SOLR-10261
> URL: https://issues.apache.org/jira/browse/SOLR-10261
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: 7.3
>
> Attachments: SOLR-10261.patch, SOLR-10261.patch
>
>
I found a reproducing seed that causes 
TestStressCloudBlindAtomicUpdates.test_dv() to fail:
> {code}
> [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestStressCloudBlindAtomicUpdates -Dtests.method=test_dv 
> -Dtests.seed=AD8E7B56D53B627F -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.locale=bg -Dtests.timezone=America/La_Paz -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] ERROR   1.21s J2 | TestStressCloudBlindAtomicUpdates.test_dv <<<
>[junit4]> Throwable #1: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F=javabin=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([AD8E7B56D53B627F:9B9A19105F66586E]:0)
>[junit4]>  at 
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>[junit4]>  at 
> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.checkField(TestStressCloudBlindAtomicUpdates.java:281)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.test_dv(TestStressCloudBlindAtomicUpdates.java:193)
>[junit4]>  at java.lang.Thread.run(Thread.java:745)
>[junit4]> Caused by: java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F=javabin=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates$Worker.run(TestStressCloudBlindAtomicUpdates.java:409)
>[junit4]>  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>[junit4]>  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>[junit4]>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>[junit4]>  ... 1 more
>[junit4]> Caused by: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://127.0.0.1:49825/solr/test_col: Async exception during 
> distributed update: Error from server at 
> {code}






[jira] [Commented] (SOLR-8327) SolrDispatchFilter is not caching new state format, which results in live fetch from ZK per request if node does not contain core from collection

2018-02-13 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362235#comment-16362235
 ] 

Ishan Chattopadhyaya commented on SOLR-8327:


Updated John's patch to look up the state version before fetching the full 
state. [~slackhappy], [~noble.paul], can you please review?
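The version-lookup idea can be sketched as a cache keyed by collection that stores the state together with the znode version it was read at; a cheap version-only read then decides whether the cached copy can be reused instead of a full live fetch. A minimal sketch against a hypothetical ZK client interface (not the actual ZkStateReader/SolrZkClient API):

```python
class CollectionStateCache:
    """Sketch of a version-checked state cache. `zk` is assumed (hypothetically)
    to expose get_version(path) -> int and get_data(path) -> (data, version)."""

    def __init__(self, zk):
        self.zk = zk
        self.cache = {}  # collection name -> (state, znode_version)

    def get_state(self, collection):
        path = f"/collections/{collection}/state.json"
        cached = self.cache.get(collection)
        # Cheap call: fetch only the znode version, not the whole state body.
        version = self.zk.get_version(path)
        if cached and cached[1] == version:
            return cached[0]  # cache hit: skip the expensive live fetch
        data, version = self.zk.get_data(path)
        self.cache[collection] = (data, version)
        return data
```

Under this scheme a node that routes many requests for a collection it hosts no core of pays one small version read per request instead of re-downloading the full state each time.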

> SolrDispatchFilter is not caching new state format, which results in live 
> fetch from ZK per request if node does not contain core from collection
> -
>
> Key: SOLR-8327
> URL: https://issues.apache.org/jira/browse/SOLR-8327
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.3
>Reporter: Jessica Cheng Mallet
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
>  Labels: solrcloud
> Attachments: SOLR-8327.patch, SOLR-8327.patch
>
>
> While perf testing with a non-SolrJ client (requests can be sent to any Solr 
> node), we noticed a huge amount of data from ZooKeeper in our tcpdump (~1G 
> for a 20-second dump). From the thread dump, we noticed this:
> java.lang.Object.wait (Native Method)
> java.lang.Object.wait (Object.java:503)
> org.apache.zookeeper.ClientCnxn.submitRequest (ClientCnxn.java:1309)
> org.apache.zookeeper.ZooKeeper.getData (ZooKeeper.java:1152)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:345)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation 
> (ZkCmdExecutor.java:61)
> org.apache.solr.common.cloud.SolrZkClient.getData (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkStateReader.getCollectionLive 
> (ZkStateReader.java:841)
> org.apache.solr.common.cloud.ZkStateReader$7.get (ZkStateReader.java:515)
> org.apache.solr.common.cloud.ClusterState.getCollectionOrNull 
> (ClusterState.java:175)
> org.apache.solr.common.cloud.ClusterState.getLeader (ClusterState.java:98)
> org.apache.solr.servlet.HttpSolrCall.getCoreByCollection 
> (HttpSolrCall.java:784)
> org.apache.solr.servlet.HttpSolrCall.init (HttpSolrCall.java:272)
> org.apache.solr.servlet.HttpSolrCall.call (HttpSolrCall.java:417)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:210)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:179)
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter 
> (ServletHandler.java:1652)
> org.eclipse.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:585)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:143)
> org.eclipse.jetty.security.SecurityHandler.handle (SecurityHandler.java:577)
> org.eclipse.jetty.server.session.SessionHandler.doHandle 
> (SessionHandler.java:223)
> org.eclipse.jetty.server.handler.ContextHandler.doHandle 
> (ContextHandler.java:1127)
> org.eclipse.jetty.servlet.ServletHandler.doScope (ServletHandler.java:515)
> org.eclipse.jetty.server.session.SessionHandler.doScope 
> (SessionHandler.java:185)
> org.eclipse.jetty.server.handler.ContextHandler.doScope 
> (ContextHandler.java:1061)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:141)
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle 
> (ContextHandlerCollection.java:215)
> org.eclipse.jetty.server.handler.HandlerCollection.handle 
> (HandlerCollection.java:110)
> org.eclipse.jetty.server.handler.HandlerWrapper.handle 
> (HandlerWrapper.java:97)
> org.eclipse.jetty.server.Server.handle (Server.java:499)
> org.eclipse.jetty.server.HttpChannel.handle (HttpChannel.java:310)
> org.eclipse.jetty.server.HttpConnection.onFillable (HttpConnection.java:257)
> org.eclipse.jetty.io.AbstractConnection$2.run (AbstractConnection.java:540)
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob 
> (QueuedThreadPool.java:635)
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run 
> (QueuedThreadPool.java:555)
> java.lang.Thread.run (Thread.java:745)
> Looks like SolrDispatchFilter doesn't have caching similar to the 
> collectionStateCache in CloudSolrClient, so if the node doesn't know about a 
> collection in the new state format, it just live-fetches it from ZooKeeper on 
> every request.






[jira] [Updated] (SOLR-8327) SolrDispatchFilter is not caching new state format, which results in live fetch from ZK per request if node does not contain core from collection

2018-02-13 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-8327:
---
Attachment: SOLR-8327.patch

> SolrDispatchFilter is not caching new state format, which results in live 
> fetch from ZK per request if node does not contain core from collection
> -
>
> Key: SOLR-8327
> URL: https://issues.apache.org/jira/browse/SOLR-8327
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.3
>Reporter: Jessica Cheng Mallet
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
>  Labels: solrcloud
> Attachments: SOLR-8327.patch, SOLR-8327.patch
>
>
> While perf testing with a non-SolrJ client (requests can be sent to any Solr 
> node), we noticed a huge amount of data from ZooKeeper in our tcpdump (~1G 
> for a 20-second dump). From the thread dump, we noticed this:
> java.lang.Object.wait (Native Method)
> java.lang.Object.wait (Object.java:503)
> org.apache.zookeeper.ClientCnxn.submitRequest (ClientCnxn.java:1309)
> org.apache.zookeeper.ZooKeeper.getData (ZooKeeper.java:1152)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:345)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation 
> (ZkCmdExecutor.java:61)
> org.apache.solr.common.cloud.SolrZkClient.getData (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkStateReader.getCollectionLive 
> (ZkStateReader.java:841)
> org.apache.solr.common.cloud.ZkStateReader$7.get (ZkStateReader.java:515)
> org.apache.solr.common.cloud.ClusterState.getCollectionOrNull 
> (ClusterState.java:175)
> org.apache.solr.common.cloud.ClusterState.getLeader (ClusterState.java:98)
> org.apache.solr.servlet.HttpSolrCall.getCoreByCollection 
> (HttpSolrCall.java:784)
> org.apache.solr.servlet.HttpSolrCall.init (HttpSolrCall.java:272)
> org.apache.solr.servlet.HttpSolrCall.call (HttpSolrCall.java:417)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:210)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:179)
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter 
> (ServletHandler.java:1652)
> org.eclipse.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:585)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:143)
> org.eclipse.jetty.security.SecurityHandler.handle (SecurityHandler.java:577)
> org.eclipse.jetty.server.session.SessionHandler.doHandle 
> (SessionHandler.java:223)
> org.eclipse.jetty.server.handler.ContextHandler.doHandle 
> (ContextHandler.java:1127)
> org.eclipse.jetty.servlet.ServletHandler.doScope (ServletHandler.java:515)
> org.eclipse.jetty.server.session.SessionHandler.doScope 
> (SessionHandler.java:185)
> org.eclipse.jetty.server.handler.ContextHandler.doScope 
> (ContextHandler.java:1061)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:141)
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle 
> (ContextHandlerCollection.java:215)
> org.eclipse.jetty.server.handler.HandlerCollection.handle 
> (HandlerCollection.java:110)
> org.eclipse.jetty.server.handler.HandlerWrapper.handle 
> (HandlerWrapper.java:97)
> org.eclipse.jetty.server.Server.handle (Server.java:499)
> org.eclipse.jetty.server.HttpChannel.handle (HttpChannel.java:310)
> org.eclipse.jetty.server.HttpConnection.onFillable (HttpConnection.java:257)
> org.eclipse.jetty.io.AbstractConnection$2.run (AbstractConnection.java:540)
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob 
> (QueuedThreadPool.java:635)
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run 
> (QueuedThreadPool.java:555)
> java.lang.Thread.run (Thread.java:745)
> Looks like SolrDispatchFilter doesn't have caching similar to the 
> collectionStateCache in CloudSolrClient, so if the node doesn't know about a 
> collection in the new state format, it just live-fetches it from ZooKeeper on 
> every request.
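A cache of the kind referenced above (the collectionStateCache in CloudSolrClient) can be sketched roughly as follows. The class, field names, and TTL here are illustrative stand-ins, not Solr's actual API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative TTL cache in the spirit of CloudSolrClient's
// collectionStateCache: serve the cached cluster state for a collection
// until the entry expires, and only then pay the live ZooKeeper fetch
// (represented here by the caller-supplied loader function).
class CollectionStateCache {
    private static final long TTL_MS = 60_000; // assumed TTL, for illustration

    private static final class Entry {
        final Object state;      // would be a DocCollection in Solr
        final long fetchedAtMs;
        Entry(Object state, long fetchedAtMs) {
            this.state = state;
            this.fetchedAtMs = fetchedAtMs;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    Object get(String collection, Function<String, Object> liveFetch) {
        long now = System.currentTimeMillis();
        Entry e = cache.get(collection);
        if (e == null || now - e.fetchedAtMs > TTL_MS) {
            // Cache miss or expired entry: one ZK round trip, then cache it.
            Object fresh = liveFetch.apply(collection);
            cache.put(collection, new Entry(fresh, now));
            return fresh;
        }
        return e.state; // hot path: no ZooKeeper traffic per request
    }
}
```

With such a cache in the dispatch path, repeated requests for the same collection would hit the cached entry instead of performing the getData call seen at the top of the thread dump.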



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-10261) TestStressCloudBlindAtomicUpdates.test_dv() fail

2018-02-10 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-10261.
-
Resolution: Fixed

> TestStressCloudBlindAtomicUpdates.test_dv() fail
> 
>
> Key: SOLR-10261
> URL: https://issues.apache.org/jira/browse/SOLR-10261
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-10261.patch, SOLR-10261.patch
>
>
> I found a reproducing seed that causes 
> TestStressCloudBlindAtomicUpdates.test_dv() to fail
> {code}
> [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestStressCloudBlindAtomicUpdates -Dtests.method=test_dv 
> -Dtests.seed=AD8E7B56D53B627F -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.locale=bg -Dtests.timezone=America/La_Paz -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] ERROR   1.21s J2 | TestStressCloudBlindAtomicUpdates.test_dv <<<
>[junit4]> Throwable #1: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F=javabin=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([AD8E7B56D53B627F:9B9A19105F66586E]:0)
>[junit4]>  at 
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>[junit4]>  at 
> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.checkField(TestStressCloudBlindAtomicUpdates.java:281)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.test_dv(TestStressCloudBlindAtomicUpdates.java:193)
>[junit4]>  at java.lang.Thread.run(Thread.java:745)
>[junit4]> Caused by: java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F=javabin=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates$Worker.run(TestStressCloudBlindAtomicUpdates.java:409)
>[junit4]>  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>[junit4]>  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>[junit4]>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>[junit4]>  ... 1 more
>[junit4]> Caused by: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://127.0.0.1:49825/solr/test_col: Async exception during 
> distributed update: Error from server at 
> {code}






[jira] [Commented] (SOLR-10261) TestStressCloudBlindAtomicUpdates.test_dv() fail

2018-02-10 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359486#comment-16359486
 ] 

Ishan Chattopadhyaya commented on SOLR-10261:
-

The nightly test is considerably slower now (only for those rare affected 
seeds), but the non-nightly test is quick. In both cases the test now passes 
(it didn't earlier), and this feels like the correct fix.

> TestStressCloudBlindAtomicUpdates.test_dv() fail
> 
>
> Key: SOLR-10261
> URL: https://issues.apache.org/jira/browse/SOLR-10261
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-10261.patch, SOLR-10261.patch
>
>
> I found a reproducing seed that causes 
> TestStressCloudBlindAtomicUpdates.test_dv() to fail
> {code}
> [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestStressCloudBlindAtomicUpdates -Dtests.method=test_dv 
> -Dtests.seed=AD8E7B56D53B627F -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.locale=bg -Dtests.timezone=America/La_Paz -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] ERROR   1.21s J2 | TestStressCloudBlindAtomicUpdates.test_dv <<<
>[junit4]> Throwable #1: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F=javabin=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([AD8E7B56D53B627F:9B9A19105F66586E]:0)
>[junit4]>  at 
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>[junit4]>  at 
> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.checkField(TestStressCloudBlindAtomicUpdates.java:281)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.test_dv(TestStressCloudBlindAtomicUpdates.java:193)
>[junit4]>  at java.lang.Thread.run(Thread.java:745)
>[junit4]> Caused by: java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F=javabin=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates$Worker.run(TestStressCloudBlindAtomicUpdates.java:409)
>[junit4]>  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>[junit4]>  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>[junit4]>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>[junit4]>  ... 1 more
>[junit4]> Caused by: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://127.0.0.1:49825/solr/test_col: Async exception during 
> distributed update: Error from server at 
> {code}






[jira] [Resolved] (SOLR-11965) collapse.field on an unsupported field should throw 400 bad request

2018-02-10 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-11965.
-
Resolution: Duplicate

> collapse.field on an unsupported field should throw 400 bad request
> ---
>
> Key: SOLR-11965
> URL: https://issues.apache.org/jira/browse/SOLR-11965
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.7, 7.2
>Reporter: ananthesh
>Priority: Major
>  Labels: collapse, collapsingQParserPlugin
>
> Currently, w.r.t. the Collapsing Query 
> Parser ([https://lucene.apache.org/solr/guide/7_2/collapse-and-expand-results.html#collapsing-query-parser]), 
> if an unsupported or unknown field is passed as the 'collapse.field' 
> parameter, the system returns HTTP status code 500, even though the error 
> message is accurate.
> {code:java}
> curl 
> "solr:8983/solr/core-name/select?q=*&fq=%7B%21collapse+field%3Dunknown-field%7D"
> {code}
> {code:javascript}
> {
>   "responseHeader":{
> "zkConnected":true,
> "status":500,
> "QTime":1,
> "params":{
>   "q":"*",
>   "fq":"{!collapse field=unknown-field}"}},
>   "error":{
> "msg":"org.apache.solr.common.SolrException: undefined field: 
> \"unknown-field\"",
> "trace":"java.lang.RuntimeException: 
> org.apache.solr.common.SolrException: undefined field: \"unknown-field\""}}
> {code}
>  
> On an unknown field, the system needs to return HTTP status code 400.
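The requested behavior could look roughly like the sketch below: validate the collapse field up front and raise a 400 instead of letting the undefined-field error surface as a 500. SolrException and the error codes here are minimal self-contained stand-ins, mirroring Solr's real classes in spirit only:

```java
import java.util.Set;

// Minimal stand-in for org.apache.solr.common.SolrException.
class SolrException extends RuntimeException {
    final int code;
    SolrException(int code, String msg) {
        super(msg);
        this.code = code;
    }
}

class CollapseFieldValidator {
    static final int BAD_REQUEST = 400; // what the issue asks for, instead of 500

    private final Set<String> schemaFields;

    CollapseFieldValidator(Set<String> schemaFields) {
        this.schemaFields = schemaFields;
    }

    // Validate collapse.field up front so an unknown field surfaces as a
    // client error (400) rather than an internal server error (500).
    void validate(String collapseField) {
        if (!schemaFields.contains(collapseField)) {
            throw new SolrException(BAD_REQUEST,
                "undefined field: \"" + collapseField + "\"");
        }
    }
}
```

The error message stays the same as today; only the HTTP status classification changes, so clients can tell a bad request apart from a genuine server failure.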






[jira] [Reopened] (SOLR-11965) collapse.field on an unsupported field should throw 400 bad request

2018-02-10 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reopened SOLR-11965:
-

Resolving it as duplicate instead of fixed.

> collapse.field on an unsupported field should throw 400 bad request
> ---
>
> Key: SOLR-11965
> URL: https://issues.apache.org/jira/browse/SOLR-11965
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.7, 7.2
>Reporter: ananthesh
>Priority: Major
>  Labels: collapse, collapsingQParserPlugin
>
> Currently, w.r.t. the Collapsing Query 
> Parser ([https://lucene.apache.org/solr/guide/7_2/collapse-and-expand-results.html#collapsing-query-parser]), 
> if an unsupported or unknown field is passed as the 'collapse.field' 
> parameter, the system returns HTTP status code 500, even though the error 
> message is accurate.
> {code:java}
> curl 
> "solr:8983/solr/core-name/select?q=*&fq=%7B%21collapse+field%3Dunknown-field%7D"
> {code}
> {code:javascript}
> {
>   "responseHeader":{
> "zkConnected":true,
> "status":500,
> "QTime":1,
> "params":{
>   "q":"*",
>   "fq":"{!collapse field=unknown-field}"}},
>   "error":{
> "msg":"org.apache.solr.common.SolrException: undefined field: 
> \"unknown-field\"",
> "trace":"java.lang.RuntimeException: 
> org.apache.solr.common.SolrException: undefined field: \"unknown-field\""}}
> {code}
>  
> On an unknown field, the system needs to return HTTP status code 400.






[jira] [Updated] (SOLR-3089) Make ResponseBuilder.isDistrib public

2018-02-10 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-3089:
---
Affects Version/s: (was: 4.0-ALPHA)

> Make ResponseBuilder.isDistrib public
> -
>
> Key: SOLR-3089
> URL: https://issues.apache.org/jira/browse/SOLR-3089
> Project: Solr
>  Issue Type: Improvement
>  Components: Response Writers
>Reporter: Rok Rejc
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: master (8.0), 7.3
>
> Attachments: SOLR-3089.patch, Solr-3089.patch
>
>
> Hi,
> I have posted this issue on a mailing list but didn't get any response.
> I am trying to write a distributed search component (a class that extends 
> SearchComponent). I have checked FacetComponent and TermsComponent. If I want 
> the search component to work in a distributed environment, I have to set 
> ResponseBuilder's isDistrib to true, like this (this is also done in 
> TermsComponent, for example):
>   public void prepare(ResponseBuilder rb) throws IOException {
>     SolrParams params = rb.req.getParams();
>     String shards = params.get(ShardParams.SHARDS);
>     if (shards != null) {
>       List<String> lst = StrUtils.splitSmart(shards, ",", true);
>       rb.shards = lst.toArray(new String[lst.size()]);
>       rb.isDistrib = true;
>     }
>   }
> If I have my component outside the package org.apache.solr.handler.component, 
> this doesn't work. Is it possible to make isDistrib public (or is this the 
> wrong procedure/behaviour/design)?
> Many thanks,
> Rok






[jira] [Resolved] (SOLR-3089) Make ResponseBuilder.isDistrib public

2018-02-10 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-3089.

   Resolution: Fixed
 Assignee: Ishan Chattopadhyaya
Fix Version/s: (was: 6.0)
   (was: 4.9)
   7.3
   master (8.0)

Thanks to all.
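With isDistrib now public, a component outside org.apache.solr.handler.component can set it directly. The sketch below shows the pattern the issue description asks for, using a simplified stand-in for ResponseBuilder (the real class carries far more state):

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for org.apache.solr.handler.component.ResponseBuilder.
class ResponseBuilder {
    public String[] shards;
    public boolean isDistrib; // package-private before SOLR-3089; public after
}

// A custom component living in any package can now mirror what
// TermsComponent does: mark the request distributed when shards are given.
class MyShardAwareComponent {
    public void prepare(ResponseBuilder rb, String shardsParam) {
        if (shardsParam != null) {
            List<String> lst = Arrays.asList(shardsParam.split(","));
            rb.shards = lst.toArray(new String[0]);
            rb.isDistrib = true; // legal from outside the package once public
        }
    }
}
```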

> Make ResponseBuilder.isDistrib public
> -
>
> Key: SOLR-3089
> URL: https://issues.apache.org/jira/browse/SOLR-3089
> Project: Solr
>  Issue Type: Improvement
>  Components: Response Writers
>Reporter: Rok Rejc
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: master (8.0), 7.3
>
> Attachments: SOLR-3089.patch, Solr-3089.patch
>
>
> Hi,
> I have posted this issue on a mailing list but didn't get any response.
> I am trying to write a distributed search component (a class that extends 
> SearchComponent). I have checked FacetComponent and TermsComponent. If I want 
> the search component to work in a distributed environment, I have to set 
> ResponseBuilder's isDistrib to true, like this (this is also done in 
> TermsComponent, for example):
>   public void prepare(ResponseBuilder rb) throws IOException {
>     SolrParams params = rb.req.getParams();
>     String shards = params.get(ShardParams.SHARDS);
>     if (shards != null) {
>       List<String> lst = StrUtils.splitSmart(shards, ",", true);
>       rb.shards = lst.toArray(new String[lst.size()]);
>       rb.isDistrib = true;
>     }
>   }
> If I have my component outside the package org.apache.solr.handler.component, 
> this doesn't work. Is it possible to make isDistrib public (or is this the 
> wrong procedure/behaviour/design)?
> Many thanks,
> Rok






[jira] [Assigned] (SOLR-11876) InPlace update fails when resolving from Tlog if schema has a required field

2018-02-08 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-11876:
---

Assignee: Ishan Chattopadhyaya

> InPlace update fails when resolving from Tlog if schema has a required field
> 
>
> Key: SOLR-11876
> URL: https://issues.apache.org/jira/browse/SOLR-11876
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
> Environment: OSX High Sierra
> java version "1.8.0_152"
> Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
> Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
>    Reporter: Justin Deoliveira
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11876.patch
>
>
> The situation is doing an in-place update of a non-indexed/stored numeric doc 
> values field multiple times in fast succession. The schema 
> has a required field ("name") in it. On the third update request, the update 
> fails complaining "missing required field: name". It seems this happens 
> when the update document is being resolved from the TLog.
> To reproduce:
> 1. Setup a schema that has:
>     - A required field other than the uniquekey field, in my case it's called 
> "name"
>     - A numeric doc values field suitable for in place update (non-indexed, 
> non-stored), in my case it's called "likes"
> 2. Execute an in place update of the document a few times in fast succession:
> {noformat}
> for i in `seq 10`; do
> curl -X POST -H 'Content-Type: application/json' 
> 'http://localhost:8983/solr/core1/update' --data-binary '
> [{
>  "id": "1",
>  "likes": { "inc": 1 }
> }]'
> done{noformat}
> The resulting stack trace:
> {noformat}
> 2018-01-19 21:27:26.644 ERROR (qtp1873653341-14) [ x:core1] 
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: [doc=1] 
> missing required field: name
>  at 
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:233)
>  at 
> org.apache.solr.handler.component.RealTimeGetComponent.toSolrDoc(RealTimeGetComponent.java:767)
>  at 
> org.apache.solr.handler.component.RealTimeGetComponent.resolveFullDocument(RealTimeGetComponent.java:423)
>  at 
> org.apache.solr.handler.component.RealTimeGetComponent.getInputDocumentFromTlog(RealTimeGetComponent.java:551)
>  at 
> org.apache.solr.handler.component.RealTimeGetComponent.getInputDocument(RealTimeGetComponent.java:609)
>  at 
> org.apache.solr.update.processor.AtomicUpdateDocumentMerger.doInPlaceUpdateMerge(AtomicUpdateDocumentMerger.java:253)
>  at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:1279)
>  at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1008)
>  at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:617)
>  at 
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>  at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
>  at 
> org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
>  at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
>  at 
> org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
>  at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
>  at 
> org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
>  at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
>  at 
> org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
>  at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
>  at 
> org.apache.solr.update.processor.FieldNameMutatingUpdateProcessorFactory$1.processAdd(FieldNameMutatingUpdateProcessorFactory.java:75)
>  at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
>  at 
> org.apache.solr.update.processor.Fiel

Re: Welcome Jason Gerlowski as committer

2018-02-08 Thread Ishan Chattopadhyaya
Congratulations Jason! :-)

On Thu, Feb 8, 2018 at 10:38 PM, Karl Wright  wrote:

> Hello Jason!
>
>
> On Thu, Feb 8, 2018 at 12:06 PM, Dawid Weiss 
> wrote:
>
>> Welcome Jason!
>>
>> Dawid
>>
>> On Thu, Feb 8, 2018 at 6:04 PM, Adrien Grand  wrote:
>> > Welcome Jason!
>> >
>> > Le jeu. 8 févr. 2018 à 18:03, David Smiley  a
>> > écrit :
>> >>
>> >> Hello everyone,
>> >>
>> >> It's my pleasure to announce that Jason Gerlowski is our latest
>> committer
>> >> for Lucene/Solr in recognition for his contributions to the project!
>> Please
>> >> join me in welcoming him.  Jason, it's tradition for you to introduce
>> >> yourself with a brief bio.
>> >>
>> >> Congratulations and Welcome!
>> >> --
>> >> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> >> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> >> http://www.solrenterprisesearchserver.com
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>


[jira] [Commented] (SOLR-11459) AddUpdateCommand#prevVersion is not cleared which may lead to problem for in-place updates of non existed documents

2018-02-05 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353172#comment-16353172
 ] 

Ishan Chattopadhyaya commented on SOLR-11459:
-

+1 to the patch. The failures seem unrelated.

> AddUpdateCommand#prevVersion is not cleared which may lead to problem for 
> in-place updates of non existed documents
> ---
>
> Key: SOLR-11459
> URL: https://issues.apache.org/jira/browse/SOLR-11459
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 7.0
>Reporter: Andrey Kudryavtsev
>Assignee: Mikhail Khludnev
>Priority: Minor
> Attachments: SOLR-11459.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I have a 1_shard / *m*_replicas SolrCloud cluster with Solr 6.6.0 and run 
> batches of 5 - 10k in-place updates from time to time. 
> Once I noticed that the job "hangs": it started and couldn't finish for a 
> while.
> Logs were full of messages like:
> {code} Missing update, on which current in-place update depends on, hasn't 
> arrived. id=__, looking for version=___, last found version=0"  {code}
> {code} 
> Tried to fetch document ___ from the leader, but the leader says document has 
> been deleted. Deleting the document here and skipping this update: Last found 
> version: 0, was looking for: ___",24,0,"but the leader says document has been 
> deleted. Deleting the document here and skipping this update: Last found 
> version: 0
> {code}
> Further analysis shows that:
> * There are 100-500 updates for non-existent documents among the other updates 
> (something that I have to deal with)
> * The leader receives a bunch of updates and executes these updates one by one. 
> {{JavabinLoader}}, which is used for processing documents, reuses the same instance 
> of {{AddUpdateCommand}} for every update and just [clears its state at the 
> end|https://github.com/apache/lucene-solr/blob/e2521b2a8baabdaf43b92192588f51e042d21e97/solr/core/src/java/org/apache/solr/handler/loader/JavabinLoader.java#L99].
>  The field [AddUpdateCommand#prevVersion| 
> https://github.com/apache/lucene-solr/blob/6396cb759f8c799f381b0730636fa412761030ce/solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java#L76]
>  is not cleared.
> * If an update is an in-place update but the specified document does not 
> exist, the update is processed as a regular atomic update (i.e. a new doc is 
> created), but {{prevVersion}} is used as the {{distrib.inplace.prevversion}} 
> parameter in sequential calls to every slave in DistributedUpdateProcessor. 
> {{prevVersion}} wasn't cleared, so it may contain the version from a previously 
> processed update.
> * The slave checks its own version of the document, which is 0 (because the doc 
> does not exist); the slave thinks that some updates were missed, spends 5 seconds in 
> [DistributedUpdateProcessor#waitForDependentUpdates|https://github.com/apache/lucene-solr/blob/e2521b2a8baabdaf43b92192588f51e042d21e97/solr/core/src/java/org/apache/solr/handler/loader/JavabinLoader.java#L99]
>  waiting for the missed updates (no luck), and also tries to get the "correct" version 
> from the leader (no luck as well)
> * So each update for a non-existent document costs *m* * 5 sec
> I worked around this with an explicit check of document existence, but it probably 
> should be fixed.
> Obviously the first guess is that prevVersion should be cleared in 
> {{AddUpdateCommand#clear}}, but I have no clue how to test it.
> {code}
> +++ solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java   
> (revision )
> @@ -78,6 +78,7 @@
>   updateTerm = null;
>   isLastDocInBatch = false;
>   version = 0;
> + prevVersion = -1;
> }
> {code}
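The proposed one-line fix is straightforward to exercise with a reuse-style check. The class below is a stripped-down stand-in for AddUpdateCommand, reduced to only the two fields the issue discusses, not Solr's real class:

```java
// Stand-in for org.apache.solr.update.AddUpdateCommand, reduced to the
// fields relevant to the issue: a command object that JavabinLoader
// reuses across updates must reset prevVersion in clear(), or the next
// update inherits a stale distrib.inplace.prevversion value.
class AddUpdateCommandSketch {
    long version = 0;
    long prevVersion = -1; // -1 means "not an in-place update"

    void clear() {
        version = 0;
        prevVersion = -1; // the one-line fix proposed in the issue's patch
    }
}
```

A unit test would set prevVersion as if an in-place update had just been processed, call clear(), and assert the field is back at -1, which is exactly the leak path described in the analysis above.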






[jira] [Commented] (SOLR-10261) TestStressCloudBlindAtomicUpdates.test_dv() fail

2018-02-05 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352659#comment-16352659
 ] 

Ishan Chattopadhyaya commented on SOLR-10261:
-

Reverting to 6af020399bd144ab8f1a9ae97af85f07d8c1258d (February 2017 commit) 
and applying this patch causes the test to pass in 3 minutes. Somewhere between 
then and now, there has been this slowdown. I'll try to track this down.

> TestStressCloudBlindAtomicUpdates.test_dv() fail
> 
>
> Key: SOLR-10261
> URL: https://issues.apache.org/jira/browse/SOLR-10261
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-10261.patch, SOLR-10261.patch
>
>
> I found a reproducing seed that causes 
> TestStressCloudBlindAtomicUpdates.test_dv() to fail
> {code}
> [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestStressCloudBlindAtomicUpdates -Dtests.method=test_dv 
> -Dtests.seed=AD8E7B56D53B627F -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.locale=bg -Dtests.timezone=America/La_Paz -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] ERROR   1.21s J2 | TestStressCloudBlindAtomicUpdates.test_dv <<<
>[junit4]> Throwable #1: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F=javabin=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([AD8E7B56D53B627F:9B9A19105F66586E]:0)
>[junit4]>  at 
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>[junit4]>  at 
> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.checkField(TestStressCloudBlindAtomicUpdates.java:281)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.test_dv(TestStressCloudBlindAtomicUpdates.java:193)
>[junit4]>  at java.lang.Thread.run(Thread.java:745)
>[junit4]> Caused by: java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F=javabin=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates$Worker.run(TestStressCloudBlindAtomicUpdates.java:409)
>[junit4]>  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>[junit4]>  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>[junit4]>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>[junit4]>  ... 1 more
>[junit4]> Caused by: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://127.0.0.1:49825/solr/test_col: Async exception during 
> distributed update: Error from server at 
> {code}






[jira] [Commented] (SOLR-10261) TestStressCloudBlindAtomicUpdates.test_dv() fail

2018-02-05 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352586#comment-16352586
 ] 

Ishan Chattopadhyaya commented on SOLR-10261:
-

The patch seems to be doing the correct thing: upon a failed in-place update on 
a replica, it puts the replica into leader-initiated recovery (LIR).

However, I don't understand the cause of the slowdown (43 min as opposed to a 
few seconds). Earlier, when I first wrote this patch, I didn't see such a 
slowdown. I suspect this is a side-effect of some other change that has 
happened in the system (over the past 8-10 months).

> TestStressCloudBlindAtomicUpdates.test_dv() fail
> 
>
> Key: SOLR-10261
> URL: https://issues.apache.org/jira/browse/SOLR-10261
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-10261.patch, SOLR-10261.patch
>
>
> I found a reproducing seed that causes 
> TestStressCloudBlindAtomicUpdates.test_dv() to fail:
> {code}
> [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestStressCloudBlindAtomicUpdates -Dtests.method=test_dv 
> -Dtests.seed=AD8E7B56D53B627F -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.locale=bg -Dtests.timezone=America/La_Paz -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] ERROR   1.21s J2 | TestStressCloudBlindAtomicUpdates.test_dv <<<
>[junit4]> Throwable #1: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F&wt=javabin&version=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([AD8E7B56D53B627F:9B9A19105F66586E]:0)
>[junit4]>  at 
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>[junit4]>  at 
> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.checkField(TestStressCloudBlindAtomicUpdates.java:281)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.test_dv(TestStressCloudBlindAtomicUpdates.java:193)
>[junit4]>  at java.lang.Thread.run(Thread.java:745)
>[junit4]> Caused by: java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F&wt=javabin&version=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates$Worker.run(TestStressCloudBlindAtomicUpdates.java:409)
>[junit4]>  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>[junit4]>  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>[junit4]>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>[junit4]>  ... 1 more
>[junit4]> Caused by: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://127.0.0.1:49825/solr/test_col: Async exception during 
> distributed update: Error from server at 
> {code}






[jira] [Commented] (SOLR-10261) TestStressCloudBlindAtomicUpdates.test_dv() fail

2018-02-05 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352543#comment-16352543
 ] 

Ishan Chattopadhyaya commented on SOLR-10261:
-

Sure [~steve_rowe], I'll take it up next.

> TestStressCloudBlindAtomicUpdates.test_dv() fail
> 
>
> Key: SOLR-10261
> URL: https://issues.apache.org/jira/browse/SOLR-10261
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-10261.patch, SOLR-10261.patch
>
>
> I found a reproducing seed that causes 
> TestStressCloudBlindAtomicUpdates.test_dv() to fail:
> {code}
> [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestStressCloudBlindAtomicUpdates -Dtests.method=test_dv 
> -Dtests.seed=AD8E7B56D53B627F -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.locale=bg -Dtests.timezone=America/La_Paz -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] ERROR   1.21s J2 | TestStressCloudBlindAtomicUpdates.test_dv <<<
>[junit4]> Throwable #1: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F&wt=javabin&version=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([AD8E7B56D53B627F:9B9A19105F66586E]:0)
>[junit4]>  at 
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>[junit4]>  at 
> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.checkField(TestStressCloudBlindAtomicUpdates.java:281)
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.test_dv(TestStressCloudBlindAtomicUpdates.java:193)
>[junit4]>  at java.lang.Thread.run(Thread.java:745)
>[junit4]> Caused by: java.lang.RuntimeException: Error from server at 
> http://127.0.0.1:49825/solr/test_col: Async exception during distributed 
> update: Error from server at 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2: Server Error
>[junit4]> request: 
> http://127.0.0.1:49824/solr/test_col_shard2_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F127.0.0.1%3A49825%2Fsolr%2Ftest_col_shard5_replica1%2F&wt=javabin&version=2
>[junit4]> Remote error message: Failed synchronous update on shard 
> StdNode: http://127.0.0.1:49836/solr/test_col_shard2_replica1/ update: 
> org.apache.solr.client.solrj.request.UpdateRequest@5919dfb3
>[junit4]>  at 
> org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates$Worker.run(TestStressCloudBlindAtomicUpdates.java:409)
>[junit4]>  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>[junit4]>  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>[junit4]>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>[junit4]>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>[junit4]>  ... 1 more
>[junit4]> Caused by: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://127.0.0.1:49825/solr/test_col: Async exception during 
> distributed update: Error from server at 
> {code}






[jira] [Commented] (SOLR-11932) ZkCmdExecutor: Retry ZkOperation on SessionExpired

2018-02-05 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352422#comment-16352422
 ] 

Ishan Chattopadhyaya commented on SOLR-11932:
-

Good catch; the fix looks good to me.

> ZkCmdExecutor: Retry ZkOperation on SessionExpired 
> ---
>
> Key: SOLR-11932
> URL: https://issues.apache.org/jira/browse/SOLR-11932
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
>Reporter: John Gallagher
>Priority: Major
> Attachments: SessionExpiredLog.txt, zk_retry.patch
>
>
> We are seeing situations where an operation, such as changing a replica's 
> state to active after a recovery, fails because the zk session has expired.
> However, these operations seem like they are retryable, because the 
> ZookeeperConnect receives an event that the session expired and tries to 
> reconnect.
> That makes the SessionExpired handling scenario seem very similar to the 
> ConnectionLoss handling scenario, so the ZkCmdExecutor seems like it could 
> handle them in the same way.
>  
> Here's an example stack trace with some slight redactions: 
> [^SessionExpiredLog.txt]  In this case, a zk operation (a read) failed with a 
> SessionExpired event, which seems retryable.  The exception kicked off a 
> reconnection, but it seems the subsequent operation (publishing as active) 
> failed (perhaps it was using a stale connection handle at that point?).
>  
> Regardless, the watch mechanism that reestablishes connection on 
> SessionExpired seems sufficient to allow the ZkCmdExecutor to retry that 
> operation at a later time and have hope of succeeding.
>  
> I have included a simple patch we are trying that catches both exceptions 
> instead of just ConnectionLossException: [^zk_retry.patch]
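The retry behaviour the patch proposes — treating SessionExpired the same way as ConnectionLoss — can be sketched as below. This is only an illustration of the idea, not the actual ZkCmdExecutor code: the exception classes here are stand-ins for ZooKeeper's KeeperException subclasses, and the retry count and linear backoff are assumed, simplified policies.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    // Stand-ins for ZooKeeper's KeeperException.ConnectionLossException
    // and KeeperException.SessionExpiredException.
    static class ConnectionLossException extends Exception {}
    static class SessionExpiredException extends Exception {}

    /** Retry an operation on connection loss OR session expiry, since both
     *  are transient once the client has reconnected (the point of the patch). */
    static <T> T retryOperation(Callable<T> op, int maxRetries, long backoffMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (ConnectionLossException | SessionExpiredException e) {
                last = e;                           // retryable: remember and back off
                Thread.sleep(backoffMs * (attempt + 1));
            }
        }
        throw last;                                 // retries exhausted
    }
}
```

Before the patch, only the first catch branch existed; the change amounts to widening the catch to cover both exception types.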






[jira] [Assigned] (SOLR-11932) ZkCmdExecutor: Retry ZkOperation on SessionExpired

2018-02-05 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-11932:
---

Assignee: Ishan Chattopadhyaya

> ZkCmdExecutor: Retry ZkOperation on SessionExpired 
> ---
>
> Key: SOLR-11932
> URL: https://issues.apache.org/jira/browse/SOLR-11932
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
>Reporter: John Gallagher
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SessionExpiredLog.txt, zk_retry.patch
>
>
> We are seeing situations where an operation, such as changing a replica's 
> state to active after a recovery, fails because the zk session has expired.
> However, these operations seem like they are retryable, because the 
> ZookeeperConnect receives an event that the session expired and tries to 
> reconnect.
> That makes the SessionExpired handling scenario seem very similar to the 
> ConnectionLoss handling scenario, so the ZkCmdExecutor seems like it could 
> handle them in the same way.
>  
> Here's an example stack trace with some slight redactions: 
> [^SessionExpiredLog.txt]  In this case, a zk operation (a read) failed with a 
> SessionExpired event, which seems retryable.  The exception kicked off a 
> reconnection, but it seems the subsequent operation (publishing as active) 
> failed (perhaps it was using a stale connection handle at that point?).
>  
> Regardless, the watch mechanism that reestablishes connection on 
> SessionExpired seems sufficient to allow the ZkCmdExecutor to retry that 
> operation at a later time and have hope of succeeding.
>  
> I have included a simple patch we are trying that catches both exceptions 
> instead of just ConnectionLossException: [^zk_retry.patch]






[jira] [Commented] (SOLR-11920) Differential file copy for IndexFetcher

2018-01-27 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342364#comment-16342364
 ] 

Ishan Chattopadhyaya commented on SOLR-11920:
-

Thanks for looking at this.

bq. .. local copy instead of the hard link approach ...
We need to verify whether this works on Windows. If not, maybe a move could 
also work here?

bq. Maybe we should turn the two INFO log lines into DEBUG? I know they are 
currently INFO as well but seems like we should change it
I'm fine either way. In the field, I've found this information helpful when 
looking at logs from production servers. Though, if you insist, I'm okay with 
making them DEBUG.

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11920.patch, SOLR-11920.patch
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether that exact file already exists on the 
> replica/slave. This is wasteful, especially for tlog or pull replicas, where 
> only a fraction of the total files are different.
> This stems from SOLR-11815.
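The differential copy boils down to comparing the remote file list against the local files by name, size and checksum, and downloading only the mismatches. A minimal sketch of that comparison — the `FileMeta` tuple here is a hypothetical stand-in for the per-file metadata IndexFetcher already exchanges, not Solr's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DifferentialCopySketch {
    // Hypothetical (name, size, checksum) tuple; the real file lists carry
    // the same information in a different shape.
    static final class FileMeta {
        final String name;
        final long size;
        final long checksum;
        FileMeta(String name, long size, long checksum) {
            this.name = name; this.size = size; this.checksum = checksum;
        }
    }

    /** Return only the remote files that must actually be downloaded. */
    static List<FileMeta> filesToFetch(List<FileMeta> remote,
                                       Map<String, FileMeta> local) {
        List<FileMeta> toFetch = new ArrayList<>();
        for (FileMeta r : remote) {
            FileMeta l = local.get(r.name);
            // Re-use the local file only when name, size and checksum all match.
            if (l == null || l.size != r.size || l.checksum != r.checksum) {
                toFetch.add(r);
            }
        }
        return toFetch;
    }
}
```

Everything not in the returned list can be re-used (linked or copied locally) instead of being transferred over the network.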






[jira] [Updated] (SOLR-11920) Differential file copy for IndexFetcher

2018-01-27 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-11920:

Attachment: SOLR-11920.patch

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11920.patch, SOLR-11920.patch
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether that exact file already exists on the 
> replica/slave. This is wasteful, especially for tlog or pull replicas, where 
> only a fraction of the total files are different.
> This stems from SOLR-11815.






[jira] [Comment Edited] (SOLR-11815) TLOG leaders going down and rejoining as a replica do fullCopy when not needed

2018-01-27 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321251#comment-16321251
 ] 

Ishan Chattopadhyaya edited comment on SOLR-11815 at 1/27/18 10:41 PM:
---

-Adding a WIP patch that mitigates this issue. In the event of a full copy, I'm 
doing a differential copy: download whatever is new/different, but re-use the 
files that are the same. It doesn't have tests, and is brittle right now (naive 
class cast).-
EDIT: Moving this to its own issue, SOLR-11920.

This issue is to find a way to avoid fullCopy=true, since it seems wasteful. 
However, if the process of doing the fullCopy itself is optimized (SOLR-11920), 
then fixing this might not be necessary.


was (Author: ichattopadhyaya):
Adding a WIP patch that mitigates this issue. In the event of a full copy, I'm 
doing a differential copy: download whatever is new/different, but re-use the 
files that are the same. It doesn't have tests, and is brittle right now (naive 
class cast).

> TLOG leaders going down and rejoining as a replica do fullCopy when not needed
> --
>
> Key: SOLR-11815
> URL: https://issues.apache.org/jira/browse/SOLR-11815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: replication (java)
>Affects Versions: 7.2
> Environment: Oracle JDK 1.8
> Ubuntu 16.04
>Reporter: Shaun Sabo
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11815.patch
>
>
> I am running a collection with a persistent high volume of writes. When the 
> leader goes down and recovers, it joins as a replica and asks the new leader 
> for the files to Sync. The isIndexStale check is finding that some files 
> differ in size and checksum which forces a fullCopy. Since our indexes are 
> rather large, a rolling restart is resulting in large amounts of data 
> transfer, and in some cases disk space contention issues.
> I do not believe the fullCopy is necessary given the circumstances. 
> Repro Steps:
> 1. collection/shard with 1 leader and 1 replica are accepting writes
> - Pull interval is 30 seconds
> - Hard Commit interval is 60 seconds
> 2. Replica executes an index pull and completes. 
> 3. Leader process Hard Commits (replica index is delayed)
> 4. leader process is killed (SIGTERM)
> 5. Replica takes over as new leader
> 6. New leader applies TLOG since last pull (cores are binary-divergent now)
> 7. Former leader comes back as New Replica
> 8. New replica initiates recovery
> - Recovery detects that the generation and version are behind and a check 
> is necessary
> 9. isIndexStale() detects that a segment exists on both the New Replica and 
> New Leader but that the size and checksum differ. 
> - This triggers fullCopy to be flagged on
> 10. Entirety of index is pulled regardless of changes
> The majority of files should not have changes, but everything gets pulled 
> because of the first file it finds with a mismatched checksum. 
> Relevant Code:
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L516-L518
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L1105-L1126
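The stale-index check the reporter describes behaves roughly like the sketch below: the first file present on both sides whose size or checksum differs flags the entire index for a full copy. This is a simplified illustration of the behaviour at the linked IndexFetcher lines, not the actual code, and the `FileMeta` tuple is a hypothetical stand-in for the compared metadata.

```java
import java.util.List;
import java.util.Map;

public class StaleCheckSketch {
    // Hypothetical (name, size, checksum) tuple for the per-file metadata
    // that the leader and replica compare.
    static final class FileMeta {
        final String name;
        final long size;
        final long checksum;
        FileMeta(String name, long size, long checksum) {
            this.name = name; this.size = size; this.checksum = checksum;
        }
    }

    /** A single mismatched common file marks the whole index "stale",
     *  which in turn triggers fullCopy — the wasteful part of this report. */
    static boolean isIndexStale(List<FileMeta> remote, Map<String, FileMeta> local) {
        for (FileMeta r : remote) {
            FileMeta l = local.get(r.name);
            if (l != null && (l.size != r.size || l.checksum != r.checksum)) {
                return true;  // one differing segment file => copy everything
            }
        }
        return false;
    }
}
```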






[jira] [Updated] (SOLR-11920) Differential file copy for IndexFetcher

2018-01-27 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-11920:

Attachment: SOLR-11920.patch

> Differential file copy for IndexFetcher
> ---
>
> Key: SOLR-11920
> URL: https://issues.apache.org/jira/browse/SOLR-11920
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11920.patch
>
>
> In the case of fullCopy=true, all files are copied over from the 
> leader/master irrespective of whether that exact file already exists on the 
> replica/slave. This is wasteful, especially for tlog or pull replicas, where 
> only a fraction of the total files are different.
> This stems from SOLR-11815.






[jira] [Created] (SOLR-11920) Differential file copy for IndexFetcher

2018-01-27 Thread Ishan Chattopadhyaya (JIRA)
Ishan Chattopadhyaya created SOLR-11920:
---

 Summary: Differential file copy for IndexFetcher
 Key: SOLR-11920
 URL: https://issues.apache.org/jira/browse/SOLR-11920
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Ishan Chattopadhyaya
Assignee: Ishan Chattopadhyaya


In the case of fullCopy=true, all files are copied over from the leader/master 
irrespective of whether that exact file already exists on the replica/slave. 
This is wasteful, especially for tlog or pull replicas, where only a fraction 
of the total files are different.

This stems from SOLR-11815.






[jira] [Assigned] (SOLR-8327) SolrDispatchFilter is not caching new state format, which results in live fetch from ZK per request if node does not contain core from collection

2018-01-26 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-8327:
--

Assignee: Ishan Chattopadhyaya  (was: Varun Thacker)

> SolrDispatchFilter is not caching new state format, which results in live 
> fetch from ZK per request if node does not contain core from collection
> -
>
> Key: SOLR-8327
> URL: https://issues.apache.org/jira/browse/SOLR-8327
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.3
>Reporter: Jessica Cheng Mallet
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
>  Labels: solrcloud
> Attachments: SOLR-8327.patch
>
>
> While perf testing with non-solrj client (request can be sent to any solr 
> node), we noticed a huge amount of data from Zookeeper in our tcpdump (~1G 
> for 20 second dump). From the thread dump, we noticed this:
> java.lang.Object.wait (Native Method)
> java.lang.Object.wait (Object.java:503)
> org.apache.zookeeper.ClientCnxn.submitRequest (ClientCnxn.java:1309)
> org.apache.zookeeper.ZooKeeper.getData (ZooKeeper.java:1152)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:345)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation 
> (ZkCmdExecutor.java:61)
> org.apache.solr.common.cloud.SolrZkClient.getData (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkStateReader.getCollectionLive 
> (ZkStateReader.java:841)
> org.apache.solr.common.cloud.ZkStateReader$7.get (ZkStateReader.java:515)
> org.apache.solr.common.cloud.ClusterState.getCollectionOrNull 
> (ClusterState.java:175)
> org.apache.solr.common.cloud.ClusterState.getLeader (ClusterState.java:98)
> org.apache.solr.servlet.HttpSolrCall.getCoreByCollection 
> (HttpSolrCall.java:784)
> org.apache.solr.servlet.HttpSolrCall.init (HttpSolrCall.java:272)
> org.apache.solr.servlet.HttpSolrCall.call (HttpSolrCall.java:417)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:210)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:179)
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter 
> (ServletHandler.java:1652)
> org.eclipse.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:585)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:143)
> org.eclipse.jetty.security.SecurityHandler.handle (SecurityHandler.java:577)
> org.eclipse.jetty.server.session.SessionHandler.doHandle 
> (SessionHandler.java:223)
> org.eclipse.jetty.server.handler.ContextHandler.doHandle 
> (ContextHandler.java:1127)
> org.eclipse.jetty.servlet.ServletHandler.doScope (ServletHandler.java:515)
> org.eclipse.jetty.server.session.SessionHandler.doScope 
> (SessionHandler.java:185)
> org.eclipse.jetty.server.handler.ContextHandler.doScope 
> (ContextHandler.java:1061)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:141)
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle 
> (ContextHandlerCollection.java:215)
> org.eclipse.jetty.server.handler.HandlerCollection.handle 
> (HandlerCollection.java:110)
> org.eclipse.jetty.server.handler.HandlerWrapper.handle 
> (HandlerWrapper.java:97)
> org.eclipse.jetty.server.Server.handle (Server.java:499)
> org.eclipse.jetty.server.HttpChannel.handle (HttpChannel.java:310)
> org.eclipse.jetty.server.HttpConnection.onFillable (HttpConnection.java:257)
> org.eclipse.jetty.io.AbstractConnection$2.run (AbstractConnection.java:540)
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob 
> (QueuedThreadPool.java:635)
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run 
> (QueuedThreadPool.java:555)
> java.lang.Thread.run (Thread.java:745)
> Looks like SolrDispatchFilter doesn't have caching similar to the 
> collectionStateCache in CloudSolrClient, so if the node doesn't know about a 
> collection in the new state format, it just live-fetches it from ZooKeeper on 
> every request.
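A cache along the lines of CloudSolrClient's collectionStateCache would avoid the per-request ZooKeeper fetch. Below is a minimal TTL-cache sketch of the idea; the entry shape, TTL policy, and the String state value are illustrative assumptions (Solr would cache a DocCollection), not the actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StateCacheSketch {
    // Cached value plus the time it was fetched; a String stands in for
    // the DocCollection a real cache would hold.
    static final class Entry {
        final String value;
        final long fetchedAtMs;
        Entry(String value, long fetchedAtMs) {
            this.value = value; this.fetchedAtMs = fetchedAtMs;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMs;

    StateCacheSketch(long ttlMs) { this.ttlMs = ttlMs; }

    /** Serve from the cache while fresh; otherwise do one live fetch. */
    String getState(String collection,
                    java.util.function.Function<String, String> liveFetch,
                    long nowMs) {
        Entry e = cache.get(collection);
        if (e != null && nowMs - e.fetchedAtMs < ttlMs) {
            return e.value;  // cache hit: no ZooKeeper round trip
        }
        String fresh = liveFetch.apply(collection);  // the expensive ZK read
        cache.put(collection, new Entry(fresh, nowMs));
        return fresh;
    }
}
```

A TTL bounds staleness; invalidating on a watch event instead would keep the cache exact, at the cost of one watch per collection.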






[jira] [Commented] (SOLR-8327) SolrDispatchFilter is not caching new state format, which results in live fetch from ZK per request if node does not contain core from collection

2018-01-26 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341284#comment-16341284
 ] 

Ishan Chattopadhyaya commented on SOLR-8327:


bq. The following code snippet would fetch the state and parse it irrespective 
of whether the state is updated. It should download the changed state only if 
the znode version is changed
I couldn't find a way to do this without two calls to ZK. Noble/John, unless 
there's an easy optimization that can be done here, do you think we should just 
go with the current patch and optimize later (say, using the smart caching 
technique)?
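The two-call approach under discussion — a cheap version probe followed by a full read only when the znode version changed — could look roughly like this. The `Zk` interface is a hypothetical stand-in for ZooKeeper's exists()/getData() (where exists() returns a Stat carrying the znode version); it is not the real client API.

```java
public class VersionCheckSketch {
    // Hypothetical narrow view of a ZooKeeper client: a cheap version probe
    // (e.g. exists() -> Stat.getVersion()) and the expensive full read.
    interface Zk {
        int getVersion(String path);
        byte[] getData(String path);
    }

    private int cachedVersion = -1;
    private byte[] cachedData;

    /** Two ZK calls when the state changed, one cheap call otherwise. */
    byte[] fetch(Zk zk, String path) {
        int v = zk.getVersion(path);
        if (v != cachedVersion) {        // znode changed since last read
            cachedData = zk.getData(path);
            cachedVersion = v;
        }
        return cachedData;               // otherwise serve the cached bytes
    }
}
```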

> SolrDispatchFilter is not caching new state format, which results in live 
> fetch from ZK per request if node does not contain core from collection
> -
>
> Key: SOLR-8327
> URL: https://issues.apache.org/jira/browse/SOLR-8327
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.3
>Reporter: Jessica Cheng Mallet
>Assignee: Varun Thacker
>Priority: Major
>  Labels: solrcloud
> Attachments: SOLR-8327.patch
>
>
> While perf testing with non-solrj client (request can be sent to any solr 
> node), we noticed a huge amount of data from Zookeeper in our tcpdump (~1G 
> for 20 second dump). From the thread dump, we noticed this:
> java.lang.Object.wait (Native Method)
> java.lang.Object.wait (Object.java:503)
> org.apache.zookeeper.ClientCnxn.submitRequest (ClientCnxn.java:1309)
> org.apache.zookeeper.ZooKeeper.getData (ZooKeeper.java:1152)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:345)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation 
> (ZkCmdExecutor.java:61)
> org.apache.solr.common.cloud.SolrZkClient.getData (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkStateReader.getCollectionLive 
> (ZkStateReader.java:841)
> org.apache.solr.common.cloud.ZkStateReader$7.get (ZkStateReader.java:515)
> org.apache.solr.common.cloud.ClusterState.getCollectionOrNull 
> (ClusterState.java:175)
> org.apache.solr.common.cloud.ClusterState.getLeader (ClusterState.java:98)
> org.apache.solr.servlet.HttpSolrCall.getCoreByCollection 
> (HttpSolrCall.java:784)
> org.apache.solr.servlet.HttpSolrCall.init (HttpSolrCall.java:272)
> org.apache.solr.servlet.HttpSolrCall.call (HttpSolrCall.java:417)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:210)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:179)
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter 
> (ServletHandler.java:1652)
> org.eclipse.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:585)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:143)
> org.eclipse.jetty.security.SecurityHandler.handle (SecurityHandler.java:577)
> org.eclipse.jetty.server.session.SessionHandler.doHandle 
> (SessionHandler.java:223)
> org.eclipse.jetty.server.handler.ContextHandler.doHandle 
> (ContextHandler.java:1127)
> org.eclipse.jetty.servlet.ServletHandler.doScope (ServletHandler.java:515)
> org.eclipse.jetty.server.session.SessionHandler.doScope 
> (SessionHandler.java:185)
> org.eclipse.jetty.server.handler.ContextHandler.doScope 
> (ContextHandler.java:1061)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:141)
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle 
> (ContextHandlerCollection.java:215)
> org.eclipse.jetty.server.handler.HandlerCollection.handle 
> (HandlerCollection.java:110)
> org.eclipse.jetty.server.handler.HandlerWrapper.handle 
> (HandlerWrapper.java:97)
> org.eclipse.jetty.server.Server.handle (Server.java:499)
> org.eclipse.jetty.server.HttpChannel.handle (HttpChannel.java:310)
> org.eclipse.jetty.server.HttpConnection.onFillable (HttpConnection.java:257)
> org.eclipse.jetty.io.AbstractConnection$2.run (AbstractConnection.java:540)
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob 
> (QueuedThreadPool.java:635)
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run 
> (QueuedThreadPool.java:555)
> java.lang.Thread.run (Thread.java:745)
> Looks like SolrDispatchFilter doesn't have caching similar to the 
> collectionStateCache in CloudSolrClient, so if the node doesn't know about a 
> collection in the new state format, it just live-fetches it from ZooKeeper on 
> every request.






[jira] [Commented] (SOLR-11828) Solr tests fail on Fedora 26, 27

2018-01-26 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341256#comment-16341256
 ] 

Ishan Chattopadhyaya commented on SOLR-11828:
-

I kept Fedora 26 updated and, to my surprise, the tests are passing fine now 
:-)
I upgraded to 27, and it's the same story. Perhaps some recent update fixed 
this.

> Solr tests fail on Fedora 26, 27
> 
>
> Key: SOLR-11828
> URL: https://issues.apache.org/jira/browse/SOLR-11828
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Priority: Major
>
> This may be a non-Solr issue, but I am not fully sure. I see tons of test 
> failures on Fedora 26 and 27, but everything is fine on Fedora 25. This is 
> the case even when the same kernel version was used for both 25 and 26 
> (passed on 25, failed on 26). The failures seem to be caused by ZK connection 
> loss. Using a docker container for Fedora 25 seems to work.
> Filing a JIRA just so that someone can investigate, and also so that people 
> avoid using Solr in production on these distributions until a fix is found.
> BTW, [~gus_heck] reported that he saw similar issues with Ubuntu 17.04:
> http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358682.html
> Here's some discussion:
> Ishan's initial post (I mistook this to be a kernel issue at first):
> http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358603.html 
> Uwe's post: 
> http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358712.html






[jira] [Resolved] (SOLR-11828) Solr tests fail on Fedora 26, 27

2018-01-26 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-11828.
-
Resolution: Cannot Reproduce
  Assignee: Ishan Chattopadhyaya

> Solr tests fail on Fedora 26, 27
> 
>
> Key: SOLR-11828
> URL: https://issues.apache.org/jira/browse/SOLR-11828
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
>
> This may be a non-Solr issue, but I am not fully sure. I see tons of test 
> failures on Fedora 26 and 27, but everything is fine on Fedora 25. This is 
> the case even when the same kernel version was used for both 25 and 26 
> (passed on 25, failed on 26). The failures seem to be caused by ZK connection 
> loss. Using a Docker container for Fedora 25 seems to work.
> Filing a JIRA just so that someone can investigate, and also so that someone 
> avoids using Solr in production on these distributions, until a fix is found.
> BTW, [~gus_heck] reported that he saw similar issues with Ubuntu 17.04:
> http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358682.html
> Here's some discussion:
> Ishan's initial post (I mistook this to be a kernel issue at first):
> http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358603.html 
> Uwe's post: 
> http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358712.html






Re: [JENKINS-EA] Lucene-Solr-7.x-Linux (64bit/jdk-10-ea+37) - Build # 1226 - Still Unstable!

2018-01-24 Thread Ishan Chattopadhyaya
Interesting.
> Say... where is the code that initializes _default to be in ZK?
It's here:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L716

On Thu, Jan 25, 2018 at 1:29 AM, David Smiley <david.w.smi...@gmail.com>
wrote:

> I looked at the logs.  From what I can see, solrconfig.xml is never copied
> from _default to the new "timeConfig" config.  The CREATE configset appears
> to complete without this happening since the log line that would have
> copied it does not appear before (or after for that matter).  I am not sure
> if the solrconfig.xml inexplicably wasn't there in the first place or if it
> might have been inexplicably omitted for the copy to a new configset.
> Say... where is the code that initializes _default to be in ZK?
>
> On Tue, Jan 23, 2018 at 3:02 AM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> I'll take a look. Maybe the commit for SOLR-11624 broke it.
>>
>> On Tue, Jan 23, 2018 at 11:43 AM, Policeman Jenkins Server <
>> jenk...@thetaphi.de> wrote:
>>
>>> Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/1226/
>>> Java: 64bit/jdk-10-ea+37 -XX:+UseCompressedOops -XX:+UseSerialGC
>>>
>>> 2 tests failed.
>>> FAILED:  org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessor
>>> Test.test
>>>
>>>
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.
> solrenterprisesearchserver.com
>


[jira] [Resolved] (SOLR-11624) collection creation should not also overwrite/delete any configset but it can!

2018-01-22 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-11624.
-
   Resolution: Fixed
Fix Version/s: 7.3
   master (8.0)

> collection creation should not also overwrite/delete any configset but it can!
> --
>
> Key: SOLR-11624
> URL: https://issues.apache.org/jira/browse/SOLR-11624
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
>Reporter: Erick Erickson
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: master (8.0), 7.3
>
> Attachments: SOLR-11624-2.patch, SOLR-11624.3.patch, 
> SOLR-11624.4.patch, SOLR-11624.patch, SOLR-11624.patch, SOLR-11624.patch, 
> SOLR-11624.patch
>
>
> Looks like a problem that crept in when we changed the _default configset 
> stuff.
> setup:
> upload a configset named "wiki"
> collections?action=CREATE=wiki&.
> My custom configset "wiki" gets overwritten by _default and then used by the 
> "wiki" collection.
> Assigning to myself only because it really needs to be fixed IMO and I don't 
> want to lose track of it. Anyone else please feel free to take it.
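The behaviour the patches aim for can be sketched as a simple guard: only seed a collection's configset from _default when no configset with the collection's name already exists. The sketch below is illustrative, not Solr's actual code; the `.AUTOCREATED` suffix and the map standing in for ZooKeeper's /configs nodes are assumptions made here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not Solr's actual implementation): never overwrite an
// existing configset; copy _default only when no matching configset exists.
public class ConfigSetGuard {
    static final String DEFAULT_CONFIGSET = "_default";

    // configStore stands in for the /configs nodes in ZooKeeper.
    static String resolveConfigSet(Map<String, String> configStore, String collectionName) {
        if (configStore.containsKey(collectionName)) {
            // A configset with the collection's name exists: use it as-is.
            return collectionName;
        }
        // Otherwise copy _default under a new, collection-specific name,
        // so the copy can never clobber a user-uploaded configset.
        String copied = collectionName + ".AUTOCREATED";
        configStore.put(copied, configStore.get(DEFAULT_CONFIGSET));
        return copied;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put(DEFAULT_CONFIGSET, "default-config");
        store.put("wiki", "custom-wiki-config");

        // The existing custom "wiki" configset is used, not overwritten.
        System.out.println(resolveConfigSet(store, "wiki"));  // wiki
        System.out.println(store.get("wiki"));                // custom-wiki-config

        // A collection without a matching configset gets a copy of _default.
        System.out.println(resolveConfigSet(store, "news"));  // news.AUTOCREATED
    }
}
```

With this guard, the failure mode in the report (CREATE silently replacing the uploaded "wiki" configset with _default) cannot occur.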






[jira] [Commented] (SOLR-6630) Deprecate the "implicit" router and rename to "manual"

2018-01-22 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335093#comment-16335093
 ] 

Ishan Chattopadhyaya commented on SOLR-6630:


bq. ping
Had dropped the ball on this; I'll pick it up next (after about 2-3 issues I'm 
currently working on).

> Deprecate the "implicit" router and rename to "manual"
> --
>
> Key: SOLR-6630
> URL: https://issues.apache.org/jira/browse/SOLR-6630
> Project: Solr
>  Issue Type: Task
>  Components: SolrCloud
>    Reporter: Shawn Heisey
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: 7.0
>
> Attachments: SOLR-6630.patch, SOLR-6630.patch
>
>
> I had this exchange with an IRC user named "kindkid" this morning:
> {noformat}
> 08:30 < kindkid> I'm using sharding with the implicit router, but I'm seeing
>  all my documents end up on just one of my 24 shards. What
>  might be causing this? (4.10.0)
> 08:35 <@elyograg> kindkid: you used the implicit router.  that means that
>   documents will be indexed on the shard you sent them
> to, not
>   routed elsewhere.
> 08:37 < kindkid> oh. wow. not sure where I got the idea, but I was under the
>  impression that implicit router would use a hash of the
>  uniqueKey modulo number of shards to pick a shard.
> 08:38 <@elyograg> I think you probably wanted the compositeId router.
> 08:39 <@elyograg> implicit is not a very good name.  It's technically
> correct,
>   but the meaning of the word is not well known.
> 08:39 <@elyograg> "manual" would be a better name.
> {noformat}
> The word "implicit" has a very specific meaning, and I think it's
> absolutely correct terminology for what it does, but I don't think that
> it's very clear to a typical person.  This is not the first time I've
> encountered the confusion.
> Could we deprecate the implicit name and use something much more
> descriptive and easily understood, like "manual" instead?  Let's go
> ahead and accept implicit in 5.x releases, but issue a warning in the
> log.  Maybe we can have a startup system property or a config option
> that will force the name to be updated in zookeeper and get rid of the
> warning.  If we do this, my bias is to have an upgrade to 6.x force the
> name change in zookeeper.
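The confusion above comes down to two routing behaviours: "implicit" leaves a document on whichever shard the client sent it to, while compositeId picks the shard from a hash of the uniqueKey. A simplified, hypothetical sketch of that contrast — Solr's compositeId router actually uses MurmurHash3 over per-shard hash ranges, so the plain modulo below is only a stand-in for "the hash decides":

```java
// Simplified illustration of the two routing behaviours discussed above.
// This is not Solr code: compositeId really uses MurmurHash3 and hash
// ranges, not String.hashCode() modulo the shard count.
public class RouterSketch {
    // implicit ("manual"): the document stays on the shard it was sent to.
    static int implicitRoute(int shardDocWasSentTo) {
        return shardDocWasSentTo;
    }

    // compositeId-style: a hash of the uniqueKey picks the shard.
    static int hashRoute(String uniqueKey, int numShards) {
        return Math.floorMod(uniqueKey.hashCode(), numShards);
    }

    public static void main(String[] args) {
        int numShards = 24;
        // Sending everything to shard 0 with the implicit router puts every
        // document on shard 0 -- exactly what kindkid observed on 24 shards.
        for (String id : new String[]{"doc1", "doc2", "doc3"}) {
            System.out.println(id + ": implicit -> " + implicitRoute(0)
                    + ", hashed -> " + hashRoute(id, numShards));
        }
    }
}
```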






[jira] [Comment Edited] (SOLR-6630) Deprecate the "implicit" router and rename to "manual"

2018-01-22 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335093#comment-16335093
 ] 

Ishan Chattopadhyaya edited comment on SOLR-6630 at 1/22/18 10:53 PM:
--

bq. ping
Had dropped the ball on this; I'll pick it up after about 2-3 issues I'm 
currently working on.


was (Author: ichattopadhyaya):
bq. ping
Had dropped the ball on this; I'll pick it up next (after about 2-3 issues I'm 
currently working on).

> Deprecate the "implicit" router and rename to "manual"
> --
>
> Key: SOLR-6630
> URL: https://issues.apache.org/jira/browse/SOLR-6630
> Project: Solr
>  Issue Type: Task
>  Components: SolrCloud
>    Reporter: Shawn Heisey
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Fix For: 7.0
>
> Attachments: SOLR-6630.patch, SOLR-6630.patch
>
>
> I had this exchange with an IRC user named "kindkid" this morning:
> {noformat}
> 08:30 < kindkid> I'm using sharding with the implicit router, but I'm seeing
>  all my documents end up on just one of my 24 shards. What
>  might be causing this? (4.10.0)
> 08:35 <@elyograg> kindkid: you used the implicit router.  that means that
>   documents will be indexed on the shard you sent them
> to, not
>   routed elsewhere.
> 08:37 < kindkid> oh. wow. not sure where I got the idea, but I was under the
>  impression that implicit router would use a hash of the
>  uniqueKey modulo number of shards to pick a shard.
> 08:38 <@elyograg> I think you probably wanted the compositeId router.
> 08:39 <@elyograg> implicit is not a very good name.  It's technically
> correct,
>   but the meaning of the word is not well known.
> 08:39 <@elyograg> "manual" would be a better name.
> {noformat}
> The word "implicit" has a very specific meaning, and I think it's
> absolutely correct terminology for what it does, but I don't think that
> it's very clear to a typical person.  This is not the first time I've
> encountered the confusion.
> Could we deprecate the implicit name and use something much more
> descriptive and easily understood, like "manual" instead?  Let's go
> ahead and accept implicit in 5.x releases, but issue a warning in the
> log.  Maybe we can have a startup system property or a config option
> that will force the name to be updated in zookeeper and get rid of the
> warning.  If we do this, my bias is to have an upgrade to 6.x force the
> name change in zookeeper.






[jira] [Commented] (SOLR-11624) collection creation should not also overwrite/delete any configset but it can!

2018-01-22 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334378#comment-16334378
 ] 

Ishan Chattopadhyaya commented on SOLR-11624:
-

Thanks [~abhidemon] for the patch (test+documentation) and thanks to Erick, 
David & Shawn for reviewing the approaches / patches.

> collection creation should not also overwrite/delete any configset but it can!
> --
>
> Key: SOLR-11624
> URL: https://issues.apache.org/jira/browse/SOLR-11624
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
>Reporter: Erick Erickson
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11624-2.patch, SOLR-11624.3.patch, 
> SOLR-11624.4.patch, SOLR-11624.patch, SOLR-11624.patch, SOLR-11624.patch, 
> SOLR-11624.patch
>
>
> Looks like a problem that crept in when we changed the _default configset 
> stuff.
> setup:
> upload a configset named "wiki"
> collections?action=CREATE=wiki&.
> My custom configset "wiki" gets overwritten by _default and then used by the 
> "wiki" collection.
> Assigning to myself only because it really needs to be fixed IMO and I don't 
> want to lose track of it. Anyone else please feel free to take it.






[jira] [Commented] (SOLR-11624) collection creation should not also overwrite/delete any configset but it can!

2018-01-22 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334125#comment-16334125
 ] 

Ishan Chattopadhyaya commented on SOLR-11624:
-

Updated the patch for master. 

> collection creation should not also overwrite/delete any configset but it can!
> --
>
> Key: SOLR-11624
> URL: https://issues.apache.org/jira/browse/SOLR-11624
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
>Reporter: Erick Erickson
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11624-2.patch, SOLR-11624.3.patch, 
> SOLR-11624.4.patch, SOLR-11624.patch, SOLR-11624.patch, SOLR-11624.patch, 
> SOLR-11624.patch
>
>
> Looks like a problem that crept in when we changed the _default configset 
> stuff.
> setup:
> upload a configset named "wiki"
> collections?action=CREATE=wiki&.
> My custom configset "wiki" gets overwritten by _default and then used by the 
> "wiki" collection.
> Assigning to myself only because it really needs to be fixed IMO and I don't 
> want to lose track of it. Anyone else please feel free to take it.






[jira] [Updated] (SOLR-11624) collection creation should not also overwrite/delete any configset but it can!

2018-01-22 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-11624:

Attachment: SOLR-11624.patch

> collection creation should not also overwrite/delete any configset but it can!
> --
>
> Key: SOLR-11624
> URL: https://issues.apache.org/jira/browse/SOLR-11624
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
>Reporter: Erick Erickson
>    Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-11624-2.patch, SOLR-11624.3.patch, 
> SOLR-11624.4.patch, SOLR-11624.patch, SOLR-11624.patch, SOLR-11624.patch, 
> SOLR-11624.patch
>
>
> Looks like a problem that crept in when we changed the _default configset 
> stuff.
> setup:
> upload a configset named "wiki"
> collections?action=CREATE=wiki&.
> My custom configset "wiki" gets overwritten by _default and then used by the 
> "wiki" collection.
> Assigning to myself only because it really needs to be fixed IMO and I don't 
> want to lose track of it. Anyone else please feel free to take it.






Re: Solr block join query not giving results

2018-01-17 Thread Ishan Chattopadhyaya
> So, the _root_ value is created by me, not internally by Solr. Would that
> create a problem?

I think this is the reason the index is corrupted.

On Wed, Jan 17, 2018 at 5:20 PM, Aashish Agarwal 
wrote:

> No, that should not be the case, because the query works for price:[0 TO
> 10] and price:[10 TO 20], so the index is fine; for price:[0 TO 20] the query
> still fails.
>
> I user the csv to import data as described in
> https://gist.github.com/mkhludnev/6406734#file-t-shirts-xml
> So, the _root_ value is created by me, not internally by Solr. Would that
> create a problem?
>
> Thanks,
> Aashish
>
> On Jan 17, 2018 4:46 PM, "Mikhail Khludnev"  wrote:
>
>> Sounds like corrupted index.
>> https://issues.apache.org/jira/browse/SOLR-7606
>>
>> On Wed, Jan 17, 2018 at 9:00 AM, Aashish Agarwal 
>> wrote:
>>
>>> Hi,
>>>
>>> I am using a block join query to get parent documents using a filter on
>>> children. But when the number of results is large, the query fails with an
>>> ArrayIndexOutOfBoundsException. E.g., the range query price:[0 TO 20] fails,
>>> but price:[0 TO 10] and price:[10 TO 20] work fine. I am using Solr 4.6.0.
>>>
>>> Thanks,
>>> Aashish
>>>
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
>
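For context on the exchange above: block join relies on parent and child documents being indexed as one contiguous block, with every child carrying the parent's uniqueKey in its _root_ field. Hand-assembled _root_ values (as in the CSV import described) can silently violate that invariant. A hypothetical checker for the invariant — not Solr code, and the map-based document shape is an assumption for illustration:

```java
import java.util.List;
import java.util.Map;

// Hypothetical validator: checks the invariant block join depends on --
// every document in a block carries the parent's id as its _root_ value.
public class BlockInvariantCheck {
    static boolean validBlock(Map<String, String> parent, List<Map<String, String>> children) {
        String parentId = parent.get("id");
        // The parent's own _root_ must be its id.
        if (!parentId.equals(parent.get("_root_"))) return false;
        for (Map<String, String> child : children) {
            // Every child must point back at this parent.
            if (!parentId.equals(child.get("_root_"))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> parent = Map.of("id", "p1", "_root_", "p1");
        List<Map<String, String>> ok = List.of(Map.of("id", "c1", "_root_", "p1"));
        List<Map<String, String>> bad = List.of(Map.of("id", "c2", "_root_", "p9"));
        System.out.println(validBlock(parent, ok));   // true
        System.out.println(validBlock(parent, bad));  // false
    }
}
```

When Solr builds blocks from nested documents itself, this invariant holds by construction; assigning _root_ by hand during a flat CSV import is where it can break, which matches the "corrupted index" diagnosis above.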


[jira] [Commented] (SOLR-8327) SolrDispatchFilter is not caching new state format, which results in live fetch from ZK per request if node does not contain core from collection

2018-01-16 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327615#comment-16327615
 ] 

Ishan Chattopadhyaya commented on SOLR-8327:


I plan to commit this shortly, unless there are any objections. We can open a 
follow-up JIRA to tackle smarter caching here, which will be an improvement 
over this.

> SolrDispatchFilter is not caching new state format, which results in live 
> fetch from ZK per request if node does not contain core from collection
> -
>
> Key: SOLR-8327
> URL: https://issues.apache.org/jira/browse/SOLR-8327
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.3
>Reporter: Jessica Cheng Mallet
>Assignee: Varun Thacker
>Priority: Major
>  Labels: solrcloud
> Attachments: SOLR-8327.patch
>
>
> While perf testing with a non-SolrJ client (requests can be sent to any Solr 
> node), we noticed a huge amount of data from ZooKeeper in our tcpdump (~1G 
> for a 20-second dump). From the thread dump, we noticed this:
> java.lang.Object.wait (Native Method)
> java.lang.Object.wait (Object.java:503)
> org.apache.zookeeper.ClientCnxn.submitRequest (ClientCnxn.java:1309)
> org.apache.zookeeper.ZooKeeper.getData (ZooKeeper.java:1152)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:345)
> org.apache.solr.common.cloud.SolrZkClient$7.execute (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation 
> (ZkCmdExecutor.java:61)
> org.apache.solr.common.cloud.SolrZkClient.getData (SolrZkClient.java:342)
> org.apache.solr.common.cloud.ZkStateReader.getCollectionLive 
> (ZkStateReader.java:841)
> org.apache.solr.common.cloud.ZkStateReader$7.get (ZkStateReader.java:515)
> org.apache.solr.common.cloud.ClusterState.getCollectionOrNull 
> (ClusterState.java:175)
> org.apache.solr.common.cloud.ClusterState.getLeader (ClusterState.java:98)
> org.apache.solr.servlet.HttpSolrCall.getCoreByCollection 
> (HttpSolrCall.java:784)
> org.apache.solr.servlet.HttpSolrCall.init (HttpSolrCall.java:272)
> org.apache.solr.servlet.HttpSolrCall.call (HttpSolrCall.java:417)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:210)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter 
> (SolrDispatchFilter.java:179)
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter 
> (ServletHandler.java:1652)
> org.eclipse.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:585)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:143)
> org.eclipse.jetty.security.SecurityHandler.handle (SecurityHandler.java:577)
> org.eclipse.jetty.server.session.SessionHandler.doHandle 
> (SessionHandler.java:223)
> org.eclipse.jetty.server.handler.ContextHandler.doHandle 
> (ContextHandler.java:1127)
> org.eclipse.jetty.servlet.ServletHandler.doScope (ServletHandler.java:515)
> org.eclipse.jetty.server.session.SessionHandler.doScope 
> (SessionHandler.java:185)
> org.eclipse.jetty.server.handler.ContextHandler.doScope 
> (ContextHandler.java:1061)
> org.eclipse.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:141)
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle 
> (ContextHandlerCollection.java:215)
> org.eclipse.jetty.server.handler.HandlerCollection.handle 
> (HandlerCollection.java:110)
> org.eclipse.jetty.server.handler.HandlerWrapper.handle 
> (HandlerWrapper.java:97)
> org.eclipse.jetty.server.Server.handle (Server.java:499)
> org.eclipse.jetty.server.HttpChannel.handle (HttpChannel.java:310)
> org.eclipse.jetty.server.HttpConnection.onFillable (HttpConnection.java:257)
> org.eclipse.jetty.io.AbstractConnection$2.run (AbstractConnection.java:540)
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob 
> (QueuedThreadPool.java:635)
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run 
> (QueuedThreadPool.java:555)
> java.lang.Thread.run (Thread.java:745)
> Looks like SolrDispatchFilter doesn't have caching similar to the 
> collectionStateCache in CloudSolrClient, so if the node doesn't know about a 
> collection in the new state format, it just live-fetches it from ZooKeeper on 
> every request.
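The kind of cache alluded to here can be sketched as a small TTL-bounded map in front of the live ZooKeeper fetch. The names and expiry policy below are illustrative assumptions, not CloudSolrClient's actual collectionStateCache:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative TTL cache: within the TTL window, repeated lookups for a
// collection reuse the cached state instead of a live ZooKeeper fetch.
public class StateCache {
    private static class Entry {
        final Object state;
        final long fetchedAtMillis;
        Entry(Object state, long t) { this.state = state; this.fetchedAtMillis = t; }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final long ttlMillis;
    int liveFetches = 0; // counts trips to "ZooKeeper"

    StateCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    Object get(String collection, Supplier<Object> liveFetch) {
        long now = System.currentTimeMillis();
        Entry e = cache.get(collection);
        if (e == null || now - e.fetchedAtMillis > ttlMillis) {
            liveFetches++;                        // miss or expired: live fetch
            e = new Entry(liveFetch.get(), now);
            cache.put(collection, e);
        }
        return e.state;                           // otherwise: served from cache
    }

    public static void main(String[] args) {
        StateCache cache = new StateCache(60_000);
        for (int i = 0; i < 1000; i++) {
            cache.get("collection1", () -> "state-from-zk");
        }
        // 1000 requests, but only one live fetch within the TTL window.
        System.out.println(cache.liveFetches); // 1
    }
}
```

A follow-up question for any real implementation is invalidation: a TTL bounds staleness crudely, whereas watching the collection's state node in ZooKeeper would invalidate precisely.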






Re: [VOTE] Release Lucene/Solr 7.2.1 RC1

2018-01-13 Thread Ishan Chattopadhyaya
This also happens with 7.2.0 and 7.1.0. Could be something to do with the
official Java image. Nothing that stops the RC, I think.

On Sat, Jan 13, 2018 at 5:11 PM, Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> I spun up a docker container with Java 9 (java:9-jdk) from docker hub [0].
> Downloaded the Solr 7.2.1 RC1 tarball and unzipped it. Tried to start it,
> but it failed citing some crypto issue:
> https://gist.github.com/anonymous/ed1a179b1043190b5f6fd635c6a47f23
>
> I'm trying out the same for 7.2.0 and earlier versions to see if this is a
> recent regression.
>
>
> [0] - docker run -it java:9-jdk
>
> On Wed, Jan 10, 2018 at 11:04 PM, Adrien Grand <jpou...@gmail.com> wrote:
>
>> +1
>>
>> SUCCESS! [1:29:47.999770]
>>
>> Le mer. 10 janv. 2018 à 18:03, Tomas Fernandez Lobbe <tflo...@apple.com>
>> a écrit :
>>
>>> +1
>>>
>>> SUCCESS! [1:04:34.912689]
>>>
>>> On Jan 10, 2018, at 8:01 AM, Alan Woodward <romseyg...@gmail.com> wrote:
>>>
>>> +1
>>>
>>> SUCCESS! [1:43:16.772919]
>>>
>>> I need to get a new test machine...
>>>
>>> On 10 Jan 2018, at 09:51, Dawid Weiss <dawid.we...@gmail.com> wrote:
>>>
>>> +1
>>>
>>> SUCCESS! [1:31:30.029815]
>>>
>>> Dawid
>>>
>>> On Wed, Jan 10, 2018 at 10:46 AM, Shalin Shekhar Mangar
>>> <shalinman...@gmail.com> wrote:
>>>
>>> +1
>>>
>>> SUCCESS! [1:13:22.042124]
>>>
>>> On Wed, Jan 10, 2018 at 8:00 AM, jim ferenczi <jim.feren...@gmail.com>
>>> wrote:
>>>
>>> Please vote for release candidate 1 for Lucene/Solr 7.2.1
>>>
>>> The artifacts can be downloaded from:
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.
>>> 2.1-RC1-revb2b6438b37073bee1fca40374e85bf91aa457c0b
>>>
>>> You can run the smoke tester directly with this command:
>>>
>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.
>>> 2.1-RC1-revb2b6438b37073bee1fca40374e85bf91aa457c0b
>>>
>>> Here's my +1
>>> SUCCESS! [0:38:10.689623]
>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
>>>
>>>
>>>
>


Re: [VOTE] Release Lucene/Solr 7.2.1 RC1

2018-01-13 Thread Ishan Chattopadhyaya
I spun up a docker container with Java 9 (java:9-jdk) from docker hub [0].
Downloaded the Solr 7.2.1 RC1 tarball and unzipped it. Tried to start it,
but it failed citing some crypto issue:
https://gist.github.com/anonymous/ed1a179b1043190b5f6fd635c6a47f23

I'm trying out the same for 7.2.0 and earlier versions to see if this is a
recent regression.


[0] - docker run -it java:9-jdk

On Wed, Jan 10, 2018 at 11:04 PM, Adrien Grand  wrote:

> +1
>
> SUCCESS! [1:29:47.999770]
>
> Le mer. 10 janv. 2018 à 18:03, Tomas Fernandez Lobbe 
> a écrit :
>
>> +1
>>
>> SUCCESS! [1:04:34.912689]
>>
>> On Jan 10, 2018, at 8:01 AM, Alan Woodward  wrote:
>>
>> +1
>>
>> SUCCESS! [1:43:16.772919]
>>
>> I need to get a new test machine...
>>
>> On 10 Jan 2018, at 09:51, Dawid Weiss  wrote:
>>
>> +1
>>
>> SUCCESS! [1:31:30.029815]
>>
>> Dawid
>>
>> On Wed, Jan 10, 2018 at 10:46 AM, Shalin Shekhar Mangar
>>  wrote:
>>
>> +1
>>
>> SUCCESS! [1:13:22.042124]
>>
>> On Wed, Jan 10, 2018 at 8:00 AM, jim ferenczi 
>> wrote:
>>
>> Please vote for release candidate 1 for Lucene/Solr 7.2.1
>>
>> The artifacts can be downloaded from:
>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.2.1-RC1-
>> revb2b6438b37073bee1fca40374e85bf91aa457c0b
>>
>> You can run the smoke tester directly with this command:
>>
>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.2.1-RC1-
>> revb2b6438b37073bee1fca40374e85bf91aa457c0b
>>
>> Here's my +1
>> SUCCESS! [0:38:10.689623]
>>
>>
>>
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>>
>>
>>


[jira] [Commented] (SOLR-11624) collection creation should not also overwrite/delete any configset but it can!

2018-01-10 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321260#comment-16321260
 ] 

Ishan Chattopadhyaya commented on SOLR-11624:
-

[~erickerickson], I'll review the test and the documentation change and try to 
wrap this up this week.

> collection creation should not also overwrite/delete any configset but it can!
> --
>
> Key: SOLR-11624
> URL: https://issues.apache.org/jira/browse/SOLR-11624
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
>Reporter: Erick Erickson
>    Assignee: Ishan Chattopadhyaya
> Attachments: SOLR-11624-2.patch, SOLR-11624.3.patch, 
> SOLR-11624.4.patch, SOLR-11624.patch, SOLR-11624.patch
>
>
> Looks like a problem that crept in when we changed the _default configset 
> stuff.
> setup:
> upload a configset named "wiki"
> collections?action=CREATE=wiki&.
> My custom configset "wiki" gets overwritten by _default and then used by the 
> "wiki" collection.
> Assigning to myself only because it really needs to be fixed IMO and I don't 
> want to lose track of it. Anyone else please feel free to take it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11815) TLOG leaders going down and rejoining as a replica do fullCopy when not needed

2018-01-10 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-11815:

Attachment: SOLR-11815.patch

Adding a WIP patch that mitigates this issue. In the event of a full copy, I'm 
doing a differential copy: download whatever is new/different, but re-use the 
files that are the same. It doesn't have tests and is brittle right now (a 
naive class cast).

> TLOG leaders going down and rejoining as a replica do fullCopy when not needed
> --
>
> Key: SOLR-11815
> URL: https://issues.apache.org/jira/browse/SOLR-11815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: replication (java)
>Affects Versions: 7.2
> Environment: Oracle JDK 1.8
> Ubuntu 16.04
>Reporter: Shaun Sabo
>Assignee: Ishan Chattopadhyaya
> Attachments: SOLR-11815.patch
>
>
> I am running a collection with a persistent high volume of writes. When the 
> leader goes down and recovers, it joins as a replica and asks the new leader 
> for the files to Sync. The isIndexStale check is finding that some files 
> differ in size and checksum which forces a fullCopy. Since our indexes are 
> rather large, a rolling restart is resulting in large amounts of data 
> transfer, and in some cases disk space contention issues.
> I do not believe the fullCopy is necessary given the circumstances. 
> Repro Steps:
> 1. collection/shard with 1 leader and 1 replica are accepting writes
> - Pull interval is 30 seconds
> - Hard Commit interval is 60 seconds
> 2. Replica executes an index pull and completes. 
> 3. Leader process Hard Commits (replica index is delayed)
> 4. leader process is killed (SIGTERM)
> 5. Replica takes over as new leader
> 6. New leader applies TLOG since last pull (cores are binary-divergent now)
> 7. Former leader comes back as New Replica
> 8. New replica initiates recovery
> - Recovery detects that the generation and version are behind and a check 
> is necessary
> 9. isIndexStale() detects that a segment exists on both the New Replica and 
> New Leader but that the size and checksum differ. 
> - This triggers fullCopy to be flagged on
> 10. Entirety of index is pulled regardless of changes
> The majority of files should not have changes, but everything gets pulled 
> because of the first file it finds with a mismatched checksum. 
> Relevant Code:
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L516-L518
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L1105-L1126
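The differential copy the WIP patch describes reduces to a per-file decision: reuse any local file whose size and checksum match the leader's, and download only the rest. A self-contained sketch under that assumption — the "size|checksum" string encoding is invented here for brevity and looks nothing like IndexFetcher's real file lists:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative plan for a "differential" full copy: instead of pulling the
// whole index because one file mismatches, fetch only files that are missing
// locally or whose size+checksum differ from the leader's copy.
public class DifferentialCopyPlan {
    static List<String> filesToDownload(Map<String, String> leaderFiles,
                                        Map<String, String> localFiles) {
        List<String> toDownload = new ArrayList<>();
        for (Map.Entry<String, String> e : leaderFiles.entrySet()) {
            String mine = localFiles.get(e.getKey());
            if (mine == null || !mine.equals(e.getValue())) {
                toDownload.add(e.getKey()); // missing or changed: fetch it
            }                               // identical size+checksum: reuse
        }
        return toDownload;
    }

    public static void main(String[] args) {
        Map<String, String> leader = new LinkedHashMap<>();
        leader.put("_0.cfs", "100|aa");    // unchanged segment
        leader.put("_1.cfs", "200|bb");    // segment the replica doesn't have
        leader.put("segments_5", "10|cc"); // newer segments file
        Map<String, String> local = new LinkedHashMap<>();
        local.put("_0.cfs", "100|aa");
        local.put("segments_4", "10|dd");  // stale local file, simply unused
        System.out.println(filesToDownload(leader, local)); // [_1.cfs, segments_5]
    }
}
```

This mirrors the scenario in the repro steps: after a leader bounce, most segment files are byte-identical, so only the divergent tail of the index needs to move over the wire.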






[jira] [Assigned] (SOLR-11837) More information required in README.md for Setting up project in IDEs

2018-01-09 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-11837:
---

Assignee: Ishan Chattopadhyaya

> More information required in README.md for Setting up project in IDEs 
> --
>
> Key: SOLR-11837
> URL: https://issues.apache.org/jira/browse/SOLR-11837
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.2
>Reporter: Abhishek Kumar Singh
>    Assignee: Ishan Chattopadhyaya
>  Labels: documentation
> Attachments: SOLR-11837.patch
>
>
> Sometimes, the instructions mentioned on the README.md page are not enough to 
> set up the project in the IDEs.
> The following *solr-wiki-page-links* are pretty useful, but are not present 
> on the README.md page.
> https://wiki.apache.org/solr/HowToConfigureEclipse
> https://wiki.apache.org/lucene-java/HowtoConfigureIntelliJ
> https://wiki.apache.org/lucene-java/HowtoConfigureNetbeans
> Having links on the README.md page will be quite helpful for beginners. 






[jira] [Commented] (SOLR-11741) Offline training mode for schema guessing

2018-01-07 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315462#comment-16315462
 ] 

Ishan Chattopadhyaya commented on SOLR-11741:
-

bq. This will be the mapping of field -> supported types. 

At any point in time, every field will be mapped to only *one* possible (most 
granular) field type, won't it? 


> Offline training mode for schema guessing
> -
>
> Key: SOLR-11741
> URL: https://issues.apache.org/jira/browse/SOLR-11741
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> Our data-driven schema guessing doesn't work in many situations. For 
> example, if the first document has a field with value "0", it is guessed as 
> Long, and subsequent documents with "0.0" in that field are rejected. 
> Similarly, if the same field has alphanumeric content in a later document, 
> those documents are rejected. Also, single- vs. multi-valued field guessing 
> is not ideal.
> Proposing an offline training mode where Solr accepts a bunch of documents 
> and returns a guessed schema (without indexing). This schema can then be 
> used for the actual indexing. I think the original idea is from Hoss.
> I think an initial implementation can be based on an UpdateRequestProcessor. 
> We can hash out the API soon, as we go along.






[jira] [Commented] (SOLR-3089) Make ResponseBuilder.isDistrib public

2018-01-06 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314595#comment-16314595
 ] 

Ishan Chattopadhyaya commented on SOLR-3089:


I'm going to commit this change, unless someone feels otherwise. This is useful 
for plugin developers.

> Make ResponseBuilder.isDistrib public
> -
>
> Key: SOLR-3089
> URL: https://issues.apache.org/jira/browse/SOLR-3089
> Project: Solr
>  Issue Type: Improvement
>  Components: Response Writers
>Affects Versions: 4.0-ALPHA
>Reporter: Rok Rejc
> Fix For: 4.9, 6.0
>
> Attachments: Solr-3089.patch
>
>
> Hi,
> I have posted this issue on the mailing list but didn't get any response.
> I am trying to write a distributed search component (a class that extends 
> SearchComponent). I have checked FacetComponent and TermsComponent. If I 
> want the search component to work in a distributed environment, I have to 
> set the ResponseBuilder's isDistrib to true, like this (this is also done 
> in TermsComponent, for example):
>   public void prepare(ResponseBuilder rb) throws IOException {
>     SolrParams params = rb.req.getParams();
>     String shards = params.get(ShardParams.SHARDS);
>     if (shards != null) {
>       List<String> lst = StrUtils.splitSmart(shards, ",", true);
>       rb.shards = lst.toArray(new String[lst.size()]);
>       rb.isDistrib = true;
>     }
>   }
> If I have my component outside the package org.apache.solr.handler.component 
> this doesn't work. Is it possible to make isDistrib public (or is this the 
> wrong procedure/behaviour/design)?
> Many thanks,
> Rok






[jira] [Created] (SOLR-11828) Solr tests fail on Fedora 26, 27

2018-01-06 Thread Ishan Chattopadhyaya (JIRA)
Ishan Chattopadhyaya created SOLR-11828:
---

 Summary: Solr tests fail on Fedora 26, 27
 Key: SOLR-11828
 URL: https://issues.apache.org/jira/browse/SOLR-11828
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Ishan Chattopadhyaya


This may be a non-Solr issue, but I am not fully sure. I see tons of test 
failures on Fedora 26 and 27, but everything is fine on Fedora 25. This is the 
case even when the same kernel version was used for both 25 and 26 (passed on 
25, failed on 26). The failures seem to be due to ZK connection loss. Using a 
Fedora 25 docker container seems to work.

Filing a JIRA so that someone can investigate, and so that users avoid running 
Solr in production on these distributions until a fix is found.

BTW, [~gus_heck] reported that he saw similar issues with Ubuntu 17.04:
http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358682.html

Here's some discussion:
Ishan's initial post (I mistook this to be a kernel issue at first):
http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358603.html 
Uwe's post: 
http://lucene.472066.n3.nabble.com/6-6-2-Release-tp4358534p4358712.html







[jira] [Created] (SOLR-11816) Rolling restarts (rf>1) cause latency spike

2018-01-03 Thread Ishan Chattopadhyaya (JIRA)
Ishan Chattopadhyaya created SOLR-11816:
---

 Summary: Rolling restarts (rf>1) cause latency spike
 Key: SOLR-11816
 URL: https://issues.apache.org/jira/browse/SOLR-11816
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 6.6.2, 7.2
Reporter: Ishan Chattopadhyaya


During rolling restarts, even when replication factor >=2 (maxShardsPerNode=1), 
there's a latency spike.

I've seen many of my clients hit by this issue. In some cases during rolling 
restarts, the shards get into a leaderless state (which I'm still trying to 
reproduce).

I've put together a reproduction suite using Docker, InfluxDB and Grafana:
https://github.com/chatman/solr-grafana-docker/tree/rolling-restart-test

Steps:
# Run {{docker-compose up}}
# Open Grafana at http://localhost:3000 (user: admin/pw: admin)
# Run indexing.sh and querying.sh both in separate terminals
# Let the graphs build up a bit, and then run rolling-restarts.sh
# A latency spike (about 4x) is observed.
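The rolling-restart step can be sketched as a small driver script. Everything here (node names, restart command, pause) is an assumption for illustration, not the contents of the linked rolling-restarts.sh; the defaults are safe no-ops.

```shell
#!/bin/sh
# Sketch of a rolling-restart driver in the spirit of rolling-restarts.sh
# (node names and commands are assumptions, not taken from the linked repo).
# Restarts one node at a time, pausing so replicas can recover in between.
# Defaults are no-ops; for a real run set e.g.
#   NODES="solr1 solr2 solr3" RESTART_CMD="docker restart" PAUSE=30
NODES="${NODES:-solr1 solr2 solr3}"
RESTART_CMD="${RESTART_CMD:-echo would-restart}"
PAUSE="${PAUSE:-0}"

for node in $NODES; do
  echo "restarting $node"
  $RESTART_CMD "$node"
  sleep "$PAUSE"
done
```

Restarting strictly one node at a time (rather than in parallel) is what keeps rf>=2 collections serving queries, which is why the residual latency spike observed in the graphs is surprising.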





