Re: Why does Solr sort on _docid_ with rows=0 ?

2020-02-28 Thread S G
So no one knows this then?
It seems like a good opportunity to gain some performance!
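For reference, the query in the quoted log line below is easy to reproduce from
SolrJ. A minimal sketch (the client, URL and collection name here are assumptions;
the parameters are taken from the log line):

import java.io.IOException;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ZombieCheckSketch {
  public static void main(String[] args) throws SolrServerException, IOException {
    SolrQuery q = new SolrQuery("*:*");
    q.setRows(0);                              // rows=0: only numFound is needed
    q.set("distrib", "false");                 // keep the check on the queried node
    q.setSort("_docid_", SolrQuery.ORDER.asc); // bypasses scoring entirely

    // Assumed URL/collection, just for illustration
    try (HttpSolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr/my-collection").build()) {
      QueryResponse rsp = client.query(q);
      System.out.println("numFound=" + rsp.getResults().getNumFound());
    }
  }
}

Even with rows=0, the query presumably still has to compute numFound, which would
explain the large "hits" value and the 7-second QTime in the log.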

On Tue, Feb 25, 2020 at 2:01 PM S G  wrote:

> Hi,
>
> I see a lot of such queries in my Solr 7.6.0 logs:
>
>
> *path=/select
> params={q=*:*&distrib=false&sort=_docid_+asc&rows=0&wt=javabin&version=2}
> hits=287128180 status=0 QTime=7173*
> On some searching, this seems to be the code that fires the above:
>
> https://github.com/apache/lucene-solr/blob/f80e8e11672d31c6e12069d2bd12a28b92e5a336/solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java#L89-L101
>
> Can someone explain why Solr is doing this?
> Note that "hits" is a very large value and is something which could be
> impacting performance?
>
> If you want to check a zombie server, shouldn't there be a much less
> expensive way to do a health-check instead?
>
> Thanks
> SG
>
>
>
>


Re: Lucene/Solr 8.0

2018-12-18 Thread S G
It would be nice to see Solr 8 in January, as there is an enhancement
to nested documents that we are waiting to get our hands on.
Any idea when Solr 8 will be out?

Thx
SG

On Mon, Dec 17, 2018 at 1:34 PM David Smiley 
wrote:

> I see 10 JIRA issues matching this filter:   project in (SOLR, LUCENE) AND
> priority = Blocker and status = open and fixVersion = "master (8.0)"
>click here:
>
> https://issues.apache.org/jira/issues/?jql=project%20in%20(SOLR%2C%20LUCENE)%20AND%20priority%20%3D%20Blocker%20and%20status%20%3D%20open%20and%20fixVersion%20%3D%20%22master%20(8.0)%22%20
>
> Thru the end of the month, I intend to work on those issues not yet
> assigned.
>
> On Mon, Dec 17, 2018 at 4:51 AM Adrien Grand  wrote:
>
>> +1
>>
>> On Mon, Dec 17, 2018 at 10:38 AM Alan Woodward 
>> wrote:
>> >
>> > Hi all,
>> >
>> > Now that 7.6 is out of the door (thanks Nick!) we should think about
>> cutting the 8.0 branch and moving master to 9.0.  I’ll volunteer to create
>> the branch this week - say Wednesday?  Then we should have some time to
>> clean up the master branch and uncover anything that still needs to be done
>> on 8.0 before we start the release process next year.
>> >
>> > On 22 Oct 2018, at 18:12, Cassandra Targett 
>> wrote:
>> >
>> > I'm a bit delayed, but +1 on the 7.6 and 8.0 plan from me too.
>> >
>> > On Fri, Oct 19, 2018 at 7:18 AM Erick Erickson 
>> wrote:
>> >>
>> >> +1, this gives us all a chance to prioritize getting the blockers out
>> >> of the way in a careful manner.
>> >> On Fri, Oct 19, 2018 at 7:56 AM jim ferenczi 
>> wrote:
>> >> >
>> >> > +1 too. With this new perspective we could create the branch just
>> after the 7.6 release and target the 8.0 release for January 2019, which
>> gives almost 3 months to finish the blockers?
>> >> >
>> >> > On Thu, Oct 18, 2018 at 11:56 PM, David Smiley 
>> wrote:
>> >> >>
>> >> >> +1 to a 7.6 —lots of stuff in there
>> >> >> On Thu, Oct 18, 2018 at 4:47 PM Nicholas Knize 
>> wrote:
>> >> >>>
>> >> >>> If we're planning to postpone cutting an 8.0 branch until a few
>> weeks from now then I'd like to propose (and volunteer to RM) a 7.6 release
>> targeted for late November or early December (following the typical 2 month
>> release pattern). It feels like this might give a little breathing room for
>> finishing up 8.0 blockers? And looking at the change log there appear to be
>> a healthy list of features, bug fixes, and improvements to both Solr and
>> Lucene that warrant a 7.6 release? Personally I wouldn't mind releasing the
>> LatLonShape encoding changes in LUCENE-8521 and selective indexing work
>> done in LUCENE-8496. Any objections or thoughts?
>> >> >>>
>> >> >>> - Nick
>> >> >>>
>> >> >>>
>> >> >>> On Thu, Oct 18, 2018 at 5:32 AM Đạt Cao Mạnh <
>> caomanhdat...@gmail.com> wrote:
>> >> 
>> >>  Thanks Cassandra and Jim,
>> >> 
>> >>  I created a blocker issue for Solr 8.0, SOLR-12883. Currently the
>> jira/http2 branch has a draft (immature) implementation of SPNEGO
>> authentication that is enough to make the tests pass; this implementation
>> will be removed when SOLR-12883 gets resolved. Therefore I don't see any
>> problem with merging jira/http2 to the master branch next week.
>> >> 
>> >>  On Thu, Oct 18, 2018 at 2:33 AM jim ferenczi <
>> jim.feren...@gmail.com> wrote:
>> >> >
>> >> > > But if you're working with a different assumption - that just
>> the existence of the branch does not stop Dat from still merging his work
>> and the work being included in 8.0 - then I agree, waiting for him to merge
>> doesn't need to stop the creation of the branch.
>> >> >
>> >> > Yes that's my reasoning. This issue is a blocker so we won't
>> release without it but we can work on the branch in the meantime and let
>> other people work on new features that are not targeted to 8.
>> >> >
>> >> > On Wed, Oct 17, 2018 at 8:51 PM, Cassandra Targett <
>> casstarg...@gmail.com> wrote:
>> >> >>
>> >> >> OK - I was making an assumption that the timeline for the first
>> 8.0 RC would be ASAP after the branch is created.
>> >> >>
>> >> >> It's a common perception that making a branch freezes adding
>> new features to the release, perhaps in an unofficial way (more of a
>> courtesy rather than a rule). But if you're working with a different
>> assumption - that just the existence of the branch does not stop Dat from
>> still merging his work and the work being included in 8.0 - then I agree,
>> waiting for him to merge doesn't need to stop the creation of the branch.
>> >> >>
>> >> >> If, however, once the branch is there people object to Dat
>> merging his work because it's "too late", then the branch shouldn't be
>> created yet because we want to really try to clear that blocker for 8.0.
>> >> >>
>> >> >> Cassandra
>> >> >>
>> >> >> On Wed, Oct 17, 2018 at 12:13 PM jim ferenczi <
>> jim.feren...@gmail.com> wrote:
>> >> >>>
>> >> >>> Ok thanks for answering.
>> 

Re: BugFix release 7.2.1

2018-01-08 Thread S G
Sorry, I missed some of the details but this is what we did in one of my
past projects with success:

We can begin by supporting only those machines where Apache Solr's
regression tests are run.
The aim is to identify OS-independent performance regressions, not to
certify each OS where Solr could be run.

The repository side is easy too: we store the results in a performance-results
directory that lives in the Apache Solr GitHub repo.
This directory receives metric-result file(s) whenever a Solr release is made.
If older files are present, the latest metrics file is used as the baseline
for comparing the current performance.
When not making a release, the directory can still be used to compare the
current code's performance without writing to the performance-results directory.
When releasing Solr, the performance-metrics file should get updated
automatically.

Further improvements can include:
1) Deleting older files from the performance-results directory
2) Having a performance-results directory for each OS where Solr is
released (if we think OS-dependent performance issues could be there).

These ideas can be fine-tuned to make sure they work in practice.
Please point out any issues that would make this impractical.
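To make the comparison step concrete, here is a rough sketch of what a check
against the last published metrics file could look like (the directory name, file
format, metric key and 10% threshold are all illustrative assumptions, not an
existing Solr facility):

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.Properties;
import java.util.stream.Stream;

public class PerfRegressionCheck {
  public static void main(String[] args) throws IOException {
    Path dir = Paths.get("performance-results");      // assumed directory in the repo
    Optional<Path> latest;
    try (Stream<Path> files = Files.list(dir)) {
      latest = files.sorted().reduce((a, b) -> b);     // newest baseline file by name
    }
    if (!latest.isPresent()) {
      System.out.println("No baseline metrics yet; nothing to compare against.");
      return;
    }

    Properties baseline = new Properties();
    try (InputStream in = Files.newInputStream(latest.get())) {
      baseline.load(in);
    }

    long baselineIndexMs = Long.parseLong(baseline.getProperty("index1MDocsMs")); // assumed key
    long currentIndexMs  = 95_000L;                    // placeholder: measured by the current run

    // Allow 10% headroom before flagging a regression (arbitrary threshold).
    if (currentIndexMs > baselineIndexMs * 1.10) {
      throw new AssertionError("Indexing regressed: " + currentIndexMs
          + "ms vs baseline " + baselineIndexMs + "ms");
    }
  }
}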

Thanks
SG




On Mon, Jan 8, 2018 at 12:59 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Hmmm, I think you missed my implied point. How are these metrics collected
> and compared? There are about a dozen different machines running various op
> systems etc. For these measurements to spot regressions and/or
> improvements, they need to have a repository where the results get
> published. So a report like "build XXX took YYY seconds to index ZZZ
> documents" doesn't tell us anything. You need to gather then for a
> _specific_ machine.
>
> As for whether they should be run or not, an annotation could help here;
> there are already @Slow, @Nightly and @Weekly, and @Performance could be added.
> Mike McCandless has some of these kinds of things already for Lucene, so I
> think the first thing would be to check whether they are already done; it's
> possible you'd be reinventing the wheel.
>
> Best,
> Erick
>
> On Mon, Jan 8, 2018 at 11:45 AM, S G <sg.online.em...@gmail.com> wrote:
>
>> We can put some lower limits on CPU and Memory for running a performance
>> test.
>> If those lower limits are not met, then the test will just skip execution.
>>
>> And then we put some lower bounds (time-wise) on the time spent by
>> different parts of the test like:
>>  - Max time taken to index 1 million documents
>>  - Max time taken to query, facet, pivot etc
>>  - Max time taken to delete 100,000 documents while read and writes are
>> happening.
>>
>> For all of the above, we can publish metrics like 5minRate, 95thPercent
>> and assert on values lower than a particular value.
>>
>> I know some other software compare CPU cycles across different runs as
>> well but not sure how.
>>
>> Such tests will give us more confidence when releasing/adopting new
>> features like pint compared to tint etc.
>>
>> Thanks
>> SG
>>
>>
>>
>> On Sat, Jan 6, 2018 at 9:59 AM, Erick Erickson <erickerick...@gmail.com>
>> wrote:
>>
>>> Not sure how performance tests in the unit tests would be interpreted.
>>> If I run the same suite on two different machines how do I compare the
>>> numbers?
>>>
>>> Or are you thinking of having some tests so someone can check out
>>> different versions of Solr and run the perf tests on a single machine,
>>> perhaps using bisect to pinpoint when something changed?
>>>
>>> I'm not opposed at all, just trying to understand how one would go about
>>> using such tests.
>>>
>>> Best,
>>> Erick
>>>
>>> On Fri, Jan 5, 2018 at 10:09 PM, S G <sg.online.em...@gmail.com> wrote:
>>>
>>>> Just curious to know, does the test suite include some performance test
>>>> also?
>>>> I would like to know the performance impact of using pints vs tints or
>>>> ints etc.
>>>> If they are not there, I can try to add some tests for the same.
>>>>
>>>> Thanks
>>>> SG
>>>>
>>>>
>>>> On Fri, Jan 5, 2018 at 5:47 PM, Đạt Cao Mạnh <caomanhdat...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I will work on SOLR-11771
>>>>> <https://issues.apache.org/jira/browse/SOLR-11771> today, It is a
>>>>> simple fix and will be great if it get fixed in 7.2.1
>>>>>
>>>>> On Fri, Jan 5,

Re: BugFix release 7.2.1

2018-01-08 Thread S G
We can put some lower limits on CPU and Memory for running a performance
test.
If those lower limits are not met, then the test will just skip execution.

And then we put some lower bounds (time-wise) on the time spent by
different parts of the test like:
 - Max time taken to index 1 million documents
 - Max time taken to query, facet, pivot etc
 - Max time taken to delete 100,000 documents while read and writes are
happening.

For all of the above, we can publish metrics like 5minRate and 95thPercentile
and assert that they stay below particular thresholds.

I know some other software compares CPU cycles across different runs as well,
but I am not sure how.

Such tests will give us more confidence when releasing/adopting new
features like pint compared to tint etc.
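As a rough illustration of the kind of test I have in mind (the heap check, the
time bound and the indexing helper below are assumptions for the sketch, not
existing Solr tests):

import static org.junit.Assert.assertTrue;

import org.junit.Assume;
import org.junit.Test;

public class IndexingPerfTest {
  private static final long MIN_HEAP_BYTES   = 4L * 1024 * 1024 * 1024; // skip on small machines
  private static final long MAX_INDEX_MILLIS = 120_000L;                // assumed bound for 1M docs

  @Test
  public void indexOneMillionDocsWithinBound() throws Exception {
    // Skip (rather than fail) when the machine does not meet the lower limits.
    Assume.assumeTrue("Not enough heap for a meaningful perf run",
        Runtime.getRuntime().maxMemory() >= MIN_HEAP_BYTES);

    long start = System.nanoTime();
    indexOneMillionDocs();                    // hypothetical helper doing the actual work
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;

    assertTrue("Indexing took " + elapsedMs + "ms, expected <= " + MAX_INDEX_MILLIS,
        elapsedMs <= MAX_INDEX_MILLIS);
  }

  private void indexOneMillionDocs() {
    // placeholder: a real test would add documents through a SolrClient here
  }
}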

Thanks
SG



On Sat, Jan 6, 2018 at 9:59 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Not sure how performance tests in the unit tests would be interpreted. If
> I run the same suite on two different machines how do I compare the
> numbers?
>
> Or are you thinking of having some tests so someone can check out
> different versions of Solr and run the perf tests on a single machine,
> perhaps using bisect to pinpoint when something changed?
>
> I'm not opposed at all, just trying to understand how one would go about
> using such tests.
>
> Best,
> Erick
>
> On Fri, Jan 5, 2018 at 10:09 PM, S G <sg.online.em...@gmail.com> wrote:
>
>> Just curious to know, does the test suite include some performance test
>> also?
>> I would like to know the performance impact of using pints vs tints or
>> ints etc.
>> If they are not there, I can try to add some tests for the same.
>>
>> Thanks
>> SG
>>
>>
>> On Fri, Jan 5, 2018 at 5:47 PM, Đạt Cao Mạnh <caomanhdat...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I will work on SOLR-11771
>>> <https://issues.apache.org/jira/browse/SOLR-11771> today, It is a
>>> simple fix and will be great if it get fixed in 7.2.1
>>>
>>> On Fri, Jan 5, 2018 at 11:23 PM Erick Erickson <erickerick...@gmail.com>
>>> wrote:
>>>
>>>> Neither of those Solr fixes are earth shatteringly important, they've
>>>> both been around for quite a while. I don't think it's urgent to include
>>>> them.
>>>>
>>>> That said, they're pretty simple and isolated so worth doing if Jim is
>>>> willing. But not worth straining much. I was just clearing out some backlog
>>>> over vacation.
>>>>
>>>> Strictly up to you Jim.
>>>>
>>>> Erick
>>>>
>>>> On Fri, Jan 5, 2018 at 6:54 AM, David Smiley <david.w.smi...@gmail.com>
>>>> wrote:
>>>>
>>>>> https://issues.apache.org/jira/browse/SOLR-11809 is in progress,
>>>>> should be easy and I think definitely worth backporting
>>>>>
>>>>> On Fri, Jan 5, 2018 at 8:52 AM Adrien Grand <jpou...@gmail.com> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> Looking at the changelog, 7.3 has 3 bug fixes for now: LUCENE-8077,
>>>>>> SOLR-11783 and SOLR-11555. The Lucene change doesn't seem worth
>>>>>> backporting, but maybe the Solr changes should?
>>>>>>
>> >>>>>> On Fri, Jan 5, 2018 at 12:40 PM, jim ferenczi <jim.feren...@gmail.com>
>> >>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> We discovered a bad bug in 7x that affects indices created in 6x
>>>>>>> with Lucene54DocValues format. The SortedNumericDocValues created with 
>>>>>>> this
>>>>>>> format have a bug when advanceExact is used, the values retrieved for 
>>>>>>> the
>>>>>>> docs when advanceExact returns true are invalid (the pointer to the 
>>>>>>> values
>>>>>>> is not updated):
>>>>>>> https://issues.apache.org/jira/browse/LUCENE-8117
>>>>>>> This affects all indices created in 6x with sorted numeric doc
>>>>>>> values so I wanted to ask if anyone objects to a bugfix release for 7.2
>>>>>>> (7.2.1). I also volunteer to be the release manager for this one if it 
>>>>>>> is
>>>>>>> accepted.
>>>>>>>
>>>>>>> Jim
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>>>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>>>>> http://www.solrenterprisesearchserver.com
>>>>>
>>>>
>>>>
>>
>


Re: BugFix release 7.2.1

2018-01-05 Thread S G
Just curious to know, does the test suite include some performance test
also?
I would like to know the performance impact of using pints vs tints or ints
etc.
If they are not there, I can try to add some tests for the same.

Thanks
SG


On Fri, Jan 5, 2018 at 5:47 PM, Đạt Cao Mạnh 
wrote:

> Hi all,
>
> I will work on SOLR-11771
>  today, It is a simple
> fix and will be great if it get fixed in 7.2.1
>
> On Fri, Jan 5, 2018 at 11:23 PM Erick Erickson 
> wrote:
>
>> Neither of those Solr fixes are earth shatteringly important, they've
>> both been around for quite a while. I don't think it's urgent to include
>> them.
>>
>> That said, they're pretty simple and isolated so worth doing if Jim is
>> willing. But not worth straining much. I was just clearing out some backlog
>> over vacation.
>>
>> Strictly up to you Jim.
>>
>> Erick
>>
>> On Fri, Jan 5, 2018 at 6:54 AM, David Smiley 
>> wrote:
>>
>>> https://issues.apache.org/jira/browse/SOLR-11809 is in progress, should
>>> be easy and I think definitely worth backporting
>>>
>>> On Fri, Jan 5, 2018 at 8:52 AM Adrien Grand  wrote:
>>>
 +1

 Looking at the changelog, 7.3 has 3 bug fixes for now: LUCENE-8077,
 SOLR-11783 and SOLR-11555. The Lucene change doesn't seem worth
 backporting, but maybe the Solr changes should?

 On Fri, Jan 5, 2018 at 12:40 PM, jim ferenczi 
 wrote:

> Hi,
> We discovered a bad bug in 7x that affects indices created in 6x with
> Lucene54DocValues format. The SortedNumericDocValues created with this
> format have a bug when advanceExact is used, the values retrieved for the
> docs when advanceExact returns true are invalid (the pointer to the values
> is not updated):
> https://issues.apache.org/jira/browse/LUCENE-8117
> This affects all indices created in 6x with sorted numeric doc values
> so I wanted to ask if anyone objects to a bugfix release for 7.2 (7.2.1). 
> I
> also volunteer to be the release manager for this one if it is accepted.
>
> Jim
>

>>>
>>> --
>>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.
>>> solrenterprisesearchserver.com
>>>
>>
>>


Re: SolrException: Error trying to proxy request for url: solr/sync-status/admin/system

2017-06-20 Thread S G
Got no response on the solr-user mailing list, so I am trying the dev mailing
list.

Please let me know if this should not be done here, but I thought the issue
looked strange enough to post.

Thanks
SG


On Mon, Jun 19, 2017 at 8:13 PM, S G <sg.online.em...@gmail.com> wrote:

> Hi,
>
> We are stuck in a strange problem.
> Whole cluster is red. All nodes are being shown as down.
> Restart of the nodes is not helping either.
>
> All our nodes seem to have gone into a distributed lock.
> Here is the grep command I ran on all the solr.log files:
>
> grep "Error trying to proxy request" $f | cut -d" " -f14 | sort | uniq
> -c
> And the output from 10 different solr-nodes' solr.log file is shown below:
> (Basically each node is calling admin/system on other nodes and throwing
> exceptions. You can see number of exceptions thrown by each server for
> every other server).
>
>
>
> SVR_1.log
>   13 http://SVR_2:8983/solr/my-collection/admin/system
>   18 http://SVR_3:8983/solr/my-collection/admin/system
>   19 http://SVR_4:8983/solr/my-collection/admin/system
>   15 http://SVR_6:8983/solr/my-collection/admin/system
>   13 http://SVR_7:8983/solr/my-collection/admin/system
>   13 http://SVR_9:8983/solr/my-collection/admin/system
>
> SVR_2.log
>  335 http://SVR_3:8983/solr/my-collection/admin/system
>   23 http://SVR_4:8983/solr/my-collection/admin/system
>   21 http://SVR_6:8983/solr/my-collection/admin/system
>   23 http://SVR_7:8983/solr/my-collection/admin/system
>   23 http://SVR_9:8983/solr/my-collection/admin/system
>
> SVR_3.log
>   24 http://SVR_2:8983/solr/my-collection/admin/system
>   14 http://SVR_4:8983/solr/my-collection/admin/system
>   13 http://SVR_6:8983/solr/my-collection/admin/system
>   14 http://SVR_7:8983/solr/my-collection/admin/system
>   16 http://SVR_9:8983/solr/my-collection/admin/system
>
> SVR_4.log
>   11 http://SVR_2:8983/solr/my-collection/admin/system
>   29 http://SVR_3:8983/solr/my-collection/admin/system
>7 http://SVR_6:8983/solr/my-collection/admin/system
>   16 http://SVR_7:8983/solr/my-collection/admin/system
>   11 http://SVR_9:8983/solr/my-collection/admin/system
>
> SVR_5.log
>   18 http://SVR_2:8983/solr/my-collection/admin/system
>   16 http://SVR_3:8983/solr/my-collection/admin/system
>   13 http://SVR_4:8983/solr/my-collection/admin/system
>   12 http://SVR_6:8983/solr/my-collection/admin/system
>   16 http://SVR_7:8983/solr/my-collection/admin/system
>   11 http://SVR_9:8983/solr/my-collection/admin/system
>
> SVR_6.log
>   44 http://SVR_2:8983/solr/my-collection/admin/system
>  296 http://SVR_3:8983/solr/my-collection/admin/system
>   40 http://SVR_4:8983/solr/my-collection/admin/system
>   15 http://SVR_7:8983/solr/my-collection/admin/system
>   15 http://SVR_9:8983/solr/my-collection/admin/system
>
> SVR_7.log
>   59 http://SVR_2:8983/solr/my-collection/admin/system
>  215 http://SVR_3:8983/solr/my-collection/admin/system
>   62 http://SVR_4:8983/solr/my-collection/admin/system
>   47 http://SVR_6:8983/solr/my-collection/admin/system
>   61 http://SVR_9:8983/solr/my-collection/admin/system
>
> SVR_8.log
>   13 http://SVR_2:8983/solr/my-collection/admin/system
>   18 http://SVR_3:8983/solr/my-collection/admin/system
>   10 http://SVR_4:8983/solr/my-collection/admin/system
>7 http://SVR_6:8983/solr/my-collection/admin/system
>   12 http://SVR_7:8983/solr/my-collection/admin/system
>   13 http://SVR_9:8983/solr/my-collection/admin/system
>
> SVR_9.log
>   38 http://SVR_2:8983/solr/my-collection/admin/system
>  229 http://SVR_3:8983/solr/my-collection/admin/system
>   15 http://SVR_4:8983/solr/my-collection/admin/system
>   22 http://SVR_6:8983/solr/my-collection/admin/system
>   26 http://SVR_7:8983/solr/my-collection/admin/system
>
> SVR_10.log
>9 http://SVR_2:8983/solr/my-collection/admin/system
>   22 http://SVR_3:8983/solr/my-collection/admin/system
>   18 http://SVR_4:8983/solr/my-collection/admin/system
>   14 http://SVR_6:8983/solr/my-collection/admin/system
>   18 http://SVR_7:8983/solr/my-collection/admin/system
>   10 http://SVR_9:8983/solr/my-collection/admin/system
>
>
> Thanks
> SG
>


Re: Better error message? (old version and new version are not comparable)

2017-06-12 Thread S G
Thanks Tomas,
I have read the contribution docs and also created a JIRA at
https://issues.apache.org/jira/browse/SOLR-10877
New PR is at https://github.com/apache/lucene-solr/pull/213
Please review the same.

-SG


On Mon, Jun 12, 2017 at 10:28 AM, Tomas Fernandez Lobbe <tflo...@apple.com>
wrote:

> Hi SG,
> You should create a Jira issue, then if you can rename your PR to include
> the Jira issue code in the title (that’s the way to link the Jira issue to
> the Github PR). Also, take a look at https://wiki.apache.org/
> solr/HowToContribute.
>
> Tomás
>
> On Jun 12, 2017, at 9:14 AM, S G <sg.online.em...@gmail.com> wrote:
>
> Hi Zheng,
>
> We are using 6.3 version.
> I have created a small PR that addresses this issue of better error
> message.
> Please merge the same if it looks ok.
> https://github.com/apache/lucene-solr/pull/212/files
>
> Thanks
> SG
>
>
> On Sun, Jun 11, 2017 at 8:45 PM, Zheng Lin Edwin Yeo <edwinye...@gmail.com
> > wrote:
>
>> Which version of Solr are you using?
>>
>> Also, I believe you should send this message to the user list at
>> solr-u...@lucene.apache.org
>>
>> Regards,
>> Edwin
>>
>> On 12 June 2017 at 02:45, S G <sg.online.em...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Recently I ran into an issue where the logs were showing the following:
>>>
>>>
>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>> from server at http://solr-host:8983/solr/loadtest_shard1_replica1: old
>>> version and new version are not comparable: class java.lang.Long vs class
>>> java.lang.Integer: null
>>> at 
>>> org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:765)
>>> ~[load-client.jar]
>>> at 
>>> org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1173)
>>> ~[load-client.jar]
>>> at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWit
>>> hRetryOnStaleState(CloudSolrClient.java:1062) ~[load-client.jar]
>>> at 
>>> org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1004)
>>> ~[load-client.jar]
>>> at 
>>> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
>>> ~[load-client.jar]
>>> at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
>>> ~[load-client.jar]
>>> at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:123)
>>> ~[load-client.jar]
>>> at WriteLoadTest$IndexWorker.addDocsToSolr(WriteLoadTest.java:629)
>>> [load-client.jar]
>>> at WriteLoadTest$IndexWorker.run(WriteLoadTest.java:573)
>>> [load-client.jar]
>>> Caused by: 
>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>> Error from server at http://solr-host:8983/solr/loadtest_shard1_replica1:
>>> old version and new version are not comparable: class java.lang.Long vs
>>> class java.lang.Integer: null
>>> at 
>>> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:593)
>>> ~[load-client.jar]
>>> at 
>>> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262)
>>> ~[load-client.jar]
>>> at 
>>> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:251)
>>> ~[load-client.jar]
>>> at 
>>> org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:435)
>>> ~[load-client.jar]
>>> at 
>>> org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:387)
>>> ~[load-client.jar]
>>> at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$dir
>>> ectUpdate$0(CloudSolrClient.java:742) ~[load-client.jar]
>>> at 
>>> org.apache.solr.client.solrj.impl.CloudSolrClient$$Lambda$8/745780984.call(Unknown
>>> Source) ~[?:?]
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>> ~[?:1.8.0_51]
>>> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>>> xecutor.lambda$execute$0(ExecutorUtil.java:229) ~[load-client.jar]
>>> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>>> xecutor$$Lambda$7/725894523.run(Unknown Source) ~[?:?]
>>> ... 3 more
>>>
>>>
>>> The message above is not clear at all.
>>> Which document is it talking about?
>>> I have so many documents being ingested and it's hard to figure out the
>>> same.
>>> It would have been nice if the message included a document ID too.
>>>
>>>
>>> Thanks
>>> SG
>>>
>>>
>>>
>>>
>>
>
>


Re: Better error message? (old version and new version are not comparable)

2017-06-12 Thread S G
Hi Zheng,

We are using version 6.3.
I have created a small PR that addresses this issue of a better error message.
Please merge it if it looks OK.
https://github.com/apache/lucene-solr/pull/212/files

Thanks
SG


On Sun, Jun 11, 2017 at 8:45 PM, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
wrote:

> Which version of Solr are you using?
>
> Also, I believe you should send this message to the user list at solr-user
> @lucene.apache.org
>
> Regards,
> Edwin
>
> On 12 June 2017 at 02:45, S G <sg.online.em...@gmail.com> wrote:
>
>> Hi,
>>
>> Recently I ran into an issue where the logs were showing the following:
>>
>>
>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>> from server at http://solr-host:8983/solr/loadtest_shard1_replica1: old
>> version and new version are not comparable: class java.lang.Long vs class
>> java.lang.Integer: null
>> at 
>> org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:765)
>> ~[load-client.jar]
>> at 
>> org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1173)
>> ~[load-client.jar]
>> at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWit
>> hRetryOnStaleState(CloudSolrClient.java:1062) ~[load-client.jar]
>> at 
>> org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1004)
>> ~[load-client.jar]
>> at 
>> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
>> ~[load-client.jar]
>> at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
>> ~[load-client.jar]
>> at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:123)
>> ~[load-client.jar]
>> at WriteLoadTest$IndexWorker.addDocsToSolr(WriteLoadTest.java:629)
>> [load-client.jar]
>> at WriteLoadTest$IndexWorker.run(WriteLoadTest.java:573)
>> [load-client.jar]
>> Caused by: 
>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://solr-host:8983/solr/loadtest_shard1_replica1:
>> old version and new version are not comparable: class java.lang.Long vs
>> class java.lang.Integer: null
>> at 
>> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:593)
>> ~[load-client.jar]
>> at 
>> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262)
>> ~[load-client.jar]
>> at 
>> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:251)
>> ~[load-client.jar]
>> at 
>> org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:435)
>> ~[load-client.jar]
>> at 
>> org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:387)
>> ~[load-client.jar]
>> at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$dir
>> ectUpdate$0(CloudSolrClient.java:742) ~[load-client.jar]
>> at 
>> org.apache.solr.client.solrj.impl.CloudSolrClient$$Lambda$8/745780984.call(Unknown
>> Source) ~[?:?]
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> ~[?:1.8.0_51]
>> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>> xecutor.lambda$execute$0(ExecutorUtil.java:229) ~[load-client.jar]
>> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>> xecutor$$Lambda$7/725894523.run(Unknown Source) ~[?:?]
>> ... 3 more
>>
>>
>> The message above is not clear at all.
>> Which document is it talking about?
>> I have so many documents being ingested and it's hard to figure out the
>> same.
>> It would have been nice if the message included a document ID too.
>>
>>
>> Thanks
>> SG
>>
>>
>>
>>
>


Better error message? (old version and new version are not comparable)

2017-06-11 Thread S G
Hi,

Recently I ran into an issue where the logs were showing the following:


org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
from server at http://solr-host:8983/solr/loadtest_shard1_replica1: old
version and new version are not comparable: class java.lang.Long vs class
java.lang.Integer: null
at
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:765)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1173)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1062)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1004)
~[load-client.jar]
at
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
~[load-client.jar]
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
~[load-client.jar]
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:123)
~[load-client.jar]
at WriteLoadTest$IndexWorker.addDocsToSolr(WriteLoadTest.java:629)
[load-client.jar]
at WriteLoadTest$IndexWorker.run(WriteLoadTest.java:573)
[load-client.jar]
Caused by:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://solr-host:8983/solr/loadtest_shard1_replica1: old
version and new version are not comparable: class java.lang.Long vs class
java.lang.Integer: null
at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:593)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:251)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:435)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:387)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:742)
~[load-client.jar]
at
org.apache.solr.client.solrj.impl.CloudSolrClient$$Lambda$8/745780984.call(Unknown
Source) ~[?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
~[?:1.8.0_51]
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
~[load-client.jar]
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$7/725894523.run(Unknown
Source) ~[?:?]
... 3 more


The message above is not clear at all.
Which document is it talking about?
I have so many documents being ingested that it's hard to figure out which
one it is.
It would have been nice if the message included a document ID too.


Thanks
SG


Re: Invalid shift value (64) in prefixCoded bytes (is encoded value really an INT?)

2017-06-07 Thread S G
The Solr nodes were provisioned just 2 weeks back and it was a brand new Solr
cluster.
The nodes have only ever had 6.3 indexes, and for this very test we had
created a brand new collection.

We were using the data-driven schema, and our theory is that one of the shards
guessed some field to be a long while the other shard guessed the same
field to be an integer.

If that is true, then it's a pretty bad problem IMO and one that is difficult
to reproduce (because the shards would have to simultaneously guess different
types for the same field). It is also a problem that may not show up in
several test runs but may show up directly in production, because it depends
on race conditions between the shards.
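As a workaround sketch (it does not fix the guessing race itself, and the field
name and type below are assumptions), the ambiguous field could be defined
explicitly through the Schema API before any indexing, so that no shard has to
guess its type:

import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.schema.SchemaRequest;

public class PinFieldType {
  public static void main(String[] args) throws Exception {
    Map<String, Object> field = new LinkedHashMap<>();
    field.put("name", "someNumericField"); // the field both shards were guessing (assumed name)
    field.put("type", "tlong");            // pick one concrete type up front (name depends on the schema)
    field.put("stored", true);

    // Assumed URL/collection, just for illustration
    try (HttpSolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr/my-collection").build()) {
      new SchemaRequest.AddField(field).process(client);
    }
  }
}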

And it still does not answer why the Solr UI becomes unresponsive. Why
should the thread running the Solr UI get blocked by a low-level problem?


On Tue, Jun 6, 2017 at 8:58 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Uwe just posted a detailed explanation on that jira. Note in particular
> that you must delete the index from disk to be certain all remnants of the
> old metadata are gone if you change field definitions or you can get this
> error. I generally either delete the collection or create a new one when
> changing the schema.
>
> On Jun 6, 2017 8:19 AM, "Varun Thacker" <va...@vthacker.in> wrote:
>
> Does this happen on a fresh Solr 6.3 ( as mentioned on SOLR-10806 ) or was
> the index existing with some other version and then upgraded to 6.3 ?
>
> Is the problem reproducible for you?
>
>
> On Tue, Jun 6, 2017 at 7:26 AM, S G <sg.online.em...@gmail.com> wrote:
>
>> Hi,
>>
>> We are seeing some very bad performance on our performance test that
>> tries to load a 2 shard, 3 replica system with about 2000 writes/sec and
>> 2000 reads/sec
>>
>> The exception stack trace seems to point to a specific line of code and a
>> similar stack trace is reported by users on Elastic-Search forums too.
>>
>> Could this be a common bug in Lucene which is affecting both
>> systems?
>> https://issues.apache.org/jira/browse/SOLR-10806
>>
>> One bad part about Solr is that once it happens, the whole system comes
>> to a grinding halt.
>> Solr UI is not accessible, even for the nodes not hosting any collections
>> !
>> It would be really nice to get rid of such an instability in the system.
>>
>> Thanks
>> SG
>>
>>
>>
>
>


Invalid shift value (64) in prefixCoded bytes (is encoded value really an INT?)

2017-06-06 Thread S G
Hi,

We are seeing some very bad performance on our performance test that tries
to load a 2 shard, 3 replica system with about 2000 writes/sec and 2000
reads/sec

The exception stack trace seems to point to a specific line of code and a
similar stack trace is reported by users on Elastic-Search forums too.

Could this be a common bug in Lucene which is affecting both systems?
https://issues.apache.org/jira/browse/SOLR-10806

One bad part about Solr is that once it happens, the whole system comes to
a grinding halt.
The Solr UI is not accessible, even for the nodes not hosting any collections!
It would be really nice to get rid of such instability in the system.

Thanks
SG


Backing up the indexes to a HDFS filesystem

2017-05-15 Thread S G
Hi,

I have a question regarding the documentation at
https://cwiki.apache.org/confluence/display/solr/Making+and+Restoring+Backups
for backing up the indexes to an HDFS filesystem.

1) How frequently are the indexes backed up?
2) Is there a possibility of data-loss if Solr crashes between two backups?
3) Is it production ready?
4) What is the performance impact of backup?
5) How quick are the restores? (i.e some benchmarking of time vs index size)


Thanks
SG


Re: 6.4 release

2017-01-14 Thread S G
Created https://issues.apache.org/jira/browse/SOLR-9967 as per Mark's
request.

On Sat, Jan 14, 2017 at 7:02 AM, jim ferenczi 
wrote:

> Thanks Uwe !
>
> 2017-01-14 16:01 GMT+01:00 Uwe Schindler :
>
>> Hi Jim,
>>
>> Thanks! I will update Jenkins this evening (Policeman and ASF).
>>
>> The first successful Java 9 build 152 was executed already on branch-6x,
>> so we are in good shape also on Java 9 front. 
>>
>> Uwe
>>
>>
>> On 14 January 2017 at 15:57:13 CET, jim ferenczi <
>> jim.feren...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> The release branch for 6.4 is pushed so the feature freeze phase has
>>> officially started.
>>> I don't have an admin account on Jenkins so any help would be
>>> appreciated. We need to copy the job for the new branch.
>>>
>>> No new features may be committed to the branch.
>>> Documentation patches, build patches and serious bug fixes may be
>>> committed to the branch. However, you should submit all patches you want to
>>> commit to Jira first to give others the chance to review and possibly vote
>>> against the patch. Keep in mind that it is our main intention to keep the
>>> branch as stable as possible.
>>> All patches that are intended for the branch should first be committed
>>> to the unstable branch, merged into the stable branch, and then into the
>>> current release branch.
>>> Normal unstable and stable branch development may continue as usual.
>>> However, if you plan to commit a big change to the unstable branch while
>>> the branch feature freeze is in effect, think twice: can't the addition
>>> wait a couple more days? Merges of bug fixes into the branch may become
>>> more difficult.
>>> Only Jira issues with Fix version "6.4" and priority "Blocker" will
>>> delay a release candidate build.
>>>
>>> Thanks,
>>> Jim
>>>
>>
>> --
>> Uwe Schindler
>> Achterdiek 19, 28357 Bremen
>> https://www.thetaphi.de
>>
>
>


Re: 6.4 release

2017-01-13 Thread S G
Probably too late for this release, but can we consider using ZooKeeper 3.4.9
for the next release? It brings in a lot of stability improvements (
http://zookeeper.apache.org/releases.html), and the Solr guide is still
recommending ZooKeeper 3.4.6 (
https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble
).

On Fri, Jan 13, 2017 at 11:46 AM, jim ferenczi 
wrote:

> Great news Uwe! I'll cut the branch this week end if nobody disagree.
>
>> On Jan 13, 2017 at 20:24, "Uwe Schindler"  wrote:
>
>> Hi Jim,
>>
>>
>>
>> I updated Groovy. I’d like to run Solr tests this evening through Jenkins
>> with Java 9 EA build 151 to get a list of all tests that have to be
>> disabled for Java 9, but otherwise we are ready.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> http://www.thetaphi.de
>>
>> eMail: u...@thetaphi.de
>>
>>
>>
>> *From:* jim ferenczi [mailto:jim.feren...@gmail.com]
>> *Sent:* Monday, January 9, 2017 12:37 PM
>> *To:* dev@lucene.apache.org
>> *Subject:* Re: 6.4 release
>>
>>
>>
>> Thanks for sharing Uwe. It would be wonderful to make latest java9 works
>> with the released version. I can delay the branching until
>> https://issues.apache.org/jira/browse/LUCENE-7596 is closed ?
>>
>> In the mean time I am trying to create the release notes in the wiki but
>> I don't have the permissions to create the pages. Where can I get such
>> permissions ?
>>
>>
>>
>> 2017-01-09 11:13 GMT+01:00 Uwe Schindler :
>>
>> Hi,
>>
>>
>>
>> I am fine with the start of release process, but I would like to add one
>> thing:
>>
>>
>>
>> I know that Elasticsearch wants to be compatible with recent Java 9 for
>> the continuous delievery process. The new Lucene release is compatible with
>> that (mmap unmapping works), but you cannot build the release with Java 9
>> 148+. Currently the release vote of Groovy 2.4.8 is ongoing and should
>> finish the next few days. So I’d suggest to delay a bit, so I can update
>> common-build.xml and raise the Groovy version:
>> https://issues.apache.org/jira/browse/LUCENE-7596
>>
>>
>>
>> This does not make Solr Tests work with Java 9, but that’s a different
>> discussion (mocking frameworks broke with recent Java 9):
>> https://issues.apache.org/jira/browse/SOLR-9893
>>
>> I will fix this by disabling those tests with Java 9, but that a bit of
>> work to set those assumeFalse(Constants.JAVA9….). I don’t see this as a
>> blocker!
>>
>>
>>
>> Uwe
>>
>>
>>
>> -
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> http://www.thetaphi.de
>>
>> eMail: u...@thetaphi.de
>>
>>
>>
>> *From:* jim ferenczi [mailto:jim.feren...@gmail.com]
>> *Sent:* Tuesday, January 3, 2017 5:23 PM
>> *To:* dev@lucene.apache.org
>> *Subject:* 6.4 release
>>
>>
>>
>> Hi,
>> I would like to volunteer to release 6.4. I can cut the release branch
>> next Monday if everybody agrees.
>>
>>
>>
>> Jim
>>
>>
>>
>


Re: 6.4 release

2017-01-04 Thread S G
+1 for adding the metric-related changes.
Aggregated metrics from replicas sound like a very nice thing to have.

On Wed, Jan 4, 2017 at 12:11 PM, Varun Thacker  wrote:

> +1 to cut a release branch on monday. Lots of goodies in this release!
>
> On Tue, Jan 3, 2017 at 8:23 AM, jim ferenczi 
> wrote:
>
>> Hi,
>> I would like to volunteer to release 6.4. I can cut the release branch
>> next Monday if everybody agrees.
>>
>> Jim
>>
>
>


How to identify documents failed in a batch request?

2016-12-17 Thread S G
Hi,

I am using the following code to send documents to Solr:

final UpdateRequest request = new UpdateRequest();
request.setAction(UpdateRequest.ACTION.COMMIT, false, false);
request.add(docsList);
UpdateResponse response = request.process(solrClient);

The response returned from the last line does not seem to be very helpful
in identifying which documents failed in the batch request.

Does anyone know how this can be done?
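For context, this is the kind of client-side fallback I am considering in the
meantime (a rough sketch, not an existing SolrJ feature; the helper name and the
use of the "id" field are assumptions): when the batch fails, retry the documents
one at a time so the failing ones surface.

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchFallback {

  // Returns the ids of documents that could not be indexed.
  static List<String> addWithFallback(SolrClient solrClient, List<SolrInputDocument> docsList) {
    List<String> failedIds = new ArrayList<>();
    try {
      solrClient.add(docsList);                  // happy path: the whole batch at once
    } catch (Exception batchFailure) {
      for (SolrInputDocument doc : docsList) {   // slow path: isolate the bad document(s)
        try {
          solrClient.add(doc);
        } catch (Exception e) {
          failedIds.add(String.valueOf(doc.getFieldValue("id")));
        }
      }
    }
    return failedIds;
  }
}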

Thanks
SG


Memory leak in Solr

2016-12-02 Thread S G
Hi,

This post shows some stats on Solr which indicate that there might be a
memory leak in there.

http://stackoverflow.com/questions/40939166/is-this-a-memory-leak-in-solr

Can someone please help to debug this?
It might be a very good step in making Solr stable if we can fix this.

Thanks
SG


Solr and tunable consistency?

2016-02-03 Thread S G
Hi,

Does someone know if Solr has tunable consistency like the one found in
Cassandra or ElasticSearch?

There is a replication_factor in Solr Cloud but it does
not seem to be providing tunable consistency.

Is there any other way to do the same in Solr?

Or any plans to add the same?


Thanks
SG


Re: Review needed for SOLR-7121

2015-04-30 Thread S G
It's been a month since I updated my pull request with all the test-cases.

I would really appreciate it if someone could review and merge the pull
request below.

This patch:
1) Makes the nodes more resilient to crashes,
2) Improves cloud stability and
3) Prevents distributed deadlocks.

Thanks
Sachin


On Tue, Mar 31, 2015 at 4:30 PM, S G sg.online.em...@gmail.com wrote:

 Hi,

 I have opened a pull request for
 https://issues.apache.org/jira/browse/SOLR-7121
 at https://github.com/apache/lucene-solr/pull/132


 This PR allows clients to specify some threshold values beyond which the
 targeted core can declare itself unhealthy and proactively go down to
 recover.
 When the load improves, the downed cores come up automatically.
 Such behavior will help machines survive longer by not hitting their
 hardware limits.

 The PR includes tests for all the ill-health cases.
 If someone can review this and help me get it committed, it would be much
 appreciated.

 Thanks
 Sachin





Review needed for SOLR-7121

2015-03-31 Thread S G
Hi,

I have opened a pull request for
https://issues.apache.org/jira/browse/SOLR-7121
at https://github.com/apache/lucene-solr/pull/132


This PR allows clients to specify some threshold values beyond which the
targeted core can declare itself unhealthy and proactively go down to
recover.
When the load improves, the downed cores come up automatically.
Such behavior will help machines survive longer by not hitting their
hardware limits.

The PR includes tests for all the ill-health cases.
If someone can review this and help me get it committed, it would be much
appreciated.

Thanks
Sachin


Ant output of solr tests

2015-02-24 Thread S G
Hi,

I want to print some information in a Solr test.
But I am unable to get it to print on console or file.

I have tried System.out.println and log.error, but the output does not show
up anywhere.
log4j.properties also looks OK to me.

Command run:
ant test -Dtestcase=CloudSolrClientTest

Can someone help me point in the right direction?

Thanks
Sachin


Re: bin/solr not working in trunk

2015-02-17 Thread S G
Great.
This worked just perfect.

Thanks Shalin!
Sachin


On Tue, Feb 17, 2015 at 10:46 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 Instead of ant dist, you should use ant server (earlier called ant
 example). I don't think the scripts ever worked on a source checkout
 without running ant example but I may be wrong.

 On Wed, Feb 18, 2015 at 12:11 AM, S G sg.online.em...@gmail.com wrote:

 Hi,

 I want to compile solr from the main trunk and run it.


 I did the following:

 git clone https://github.com/apache/lucene-solr.git
 cd lucene-solr/solr
 ant dist
 bin/solr -e cloud


 This creates the relevant solr nodes but fails to create a collection
 with the following error:

 $ bin/solr -e cloud

 Welcome to the SolrCloud example!


 This interactive session will help you launch a SolrCloud cluster on your 
 local workstation.

 To begin, how many Solr nodes would you like to run in your local cluster? 
 (specify 1-4 nodes) [2]
 Ok, let's start up 2 Solr nodes for your example SolrCloud cluster.

 Please enter the port for node1 [8983]
 8983
 Please enter the port for node2 [7574]
 7574

 Starting up SolrCloud node1 on port 8983 using command:

 solr start -cloud -s example/cloud/node1/solr -p 8983


 Waiting to see Solr listening on port 8983 [|]
 Started Solr server on port 8983 (pid=94888). Happy searching!



 Starting node2 on port 7574 using command:

 solr start -cloud -s example/cloud/node2/solr -p 7574 -z localhost:9983


 Waiting to see Solr listening on port 7574 [|]
 Started Solr server on port 7574 (pid=94979). Happy searching!


 Now let's create a new collection for indexing documents in your 2-node 
 cluster.

 Please provide a name for your new collection: [gettingstarted]
 gettingstarted
 How many shards would you like to split gettingstarted into? [2]
 2
 How many replicas per shard would you like to create? [2]
 2
 Please choose a configuration for the gettingstarted collection, available 
 options are:
 basic_configs, data_driven_schema_configs, or sample_techproducts_configs 
 [data_driven_schema_configs]

 Error: Could not find or load main class org.apache.solr.util.SolrCLI


 I am sure this used to work before.

 But I am not able to figure out what's wrong.


 Any help would be greatly appreciated.

 Thanks
 Sachin




 --
 Regards,
 Shalin Shekhar Mangar.



bin/solr not working in trunk

2015-02-17 Thread S G
Hi,

I want to compile solr from the main trunk and run it.


I did the following:

git clone https://github.com/apache/lucene-solr.git
cd lucene-solr/solr
ant dist
bin/solr -e cloud


This creates the relevant solr nodes but fails to create a collection with
the following error:

$ bin/solr -e cloud

Welcome to the SolrCloud example!


This interactive session will help you launch a SolrCloud cluster on
your local workstation.

To begin, how many Solr nodes would you like to run in your local
cluster? (specify 1-4 nodes) [2]
Ok, let's start up 2 Solr nodes for your example SolrCloud cluster.

Please enter the port for node1 [8983]
8983
Please enter the port for node2 [7574]
7574

Starting up SolrCloud node1 on port 8983 using command:

solr start -cloud -s example/cloud/node1/solr -p 8983


Waiting to see Solr listening on port 8983 [|]
Started Solr server on port 8983 (pid=94888). Happy searching!



Starting node2 on port 7574 using command:

solr start -cloud -s example/cloud/node2/solr -p 7574 -z localhost:9983


Waiting to see Solr listening on port 7574 [|]
Started Solr server on port 7574 (pid=94979). Happy searching!


Now let's create a new collection for indexing documents in your 2-node cluster.

Please provide a name for your new collection: [gettingstarted]
gettingstarted
How many shards would you like to split gettingstarted into? [2]
2
How many replicas per shard would you like to create? [2]
2
Please choose a configuration for the gettingstarted collection,
available options are:
basic_configs, data_driven_schema_configs, or
sample_techproducts_configs [data_driven_schema_configs]

Error: Could not find or load main class org.apache.solr.util.SolrCLI


I am sure this used to work before.

But I am not able to figure out what's wrong.


Any help would be greatly appreciated.

Thanks
Sachin


Re: Proactively going down in anticipation of high load / bad state

2015-02-17 Thread S G
Hi Erick,

I have submitted a patch at https://issues.apache.org/jira/browse/SOLR-7121
Will add tests to this if the approach is acceptable.

Thanks
Sachin


On Tue, Jan 6, 2015 at 12:26 PM, Erick Erickson erickerick...@gmail.com
wrote:

 If you have done some coding, it's always appropriate to open a JIRA
 and attach the patch for discussion.

 Yonik's Law of Patches:

 A half-baked patch in Jira, with no documentation, no tests
 and no backwards compatibility is better than no patch at all.

 Even if the approach is shot down, it may spawn alternative approaches
 or stimulate thinking.

 Best,
 Erick

 On Tue, Jan 6, 2015 at 11:50 AM, S G sg.online.em...@gmail.com wrote:
  Hi,
 
  For a solr cloud, is there a setting that allows a core to proactively go
  down if its able to detect some temporary issues like high GC, high
  thread-counts, temporary network slow down etc. ?
  Currently we see that a node gets in a distributed deadlock because its
 not
  able to detect such situations.
 
  I am exploring Solr code to see if its possible to take some proactive
  action in such cases.
  One way could be to have configurable limits for GC time, thread-count,
  response-time, 5-minute-rate etc. and make a core shut down if it senses
  problems.
  Once that happens, a background thread will monitor the trouble causing
  parameters and recover the downed core when situation improves.
 
 
  My current patch can bring down a core for:
  1) High thread-counts,
  2) High 95thPcRequestTime,
  3) Huge # of heavy queries in a given time.
 
  The patch also recovers the core when its health improves.
 
 
  If the above seems doable, then I can create a JIRA for more discussion
 and
  implementation.
 
 
  Thanks
  Sachin

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




How to ant package lucene 4.10 branch

2015-02-04 Thread S G
Hi,

I tried to run ant package on the lucene 4.10.x branch that was checked out
from GitHub.
But that gave me an error that it needed to be checked out using svn.

Then I checked out using svn and ran ant package.
But on doing so, it created ./package/solr-6.0.0-SNAPSHOT.zip and not
solr-4.10.3.zip.

Any idea how I can get it to build solr-4.10.3.zip?


I tried branch switching in svn too, but no help.
(like: svn switch
https://github.com/sachingsachin/lucene-solr/branches/all_changes)

Thanks
Sachin


Re: Querying locally before sending a distributed request

2015-01-20 Thread S G
I have submitted a patch for the ticket at
https://issues.apache.org/jira/browse/SOLR-6832

The patch creates an option *preferLocalShards* in solrconfig.xml and in
the query request params (giving more preference to the one in the query).

If this option is set,
HttpShardHandler.preferCurrentHostForDistributedReq() tries to find a local
URL and puts that URL as the first one in the list of URLs sent to
LBHttpSolrServer.
This ensures that the current host's cores will be given preference for
distributed queries.

The current host's URL is found by ResponseBuilder.findCurrentHostAddress(),
which searches for the current core's name in the list of shards.
The default value of the option is 'false' to preserve the normal behavior.
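For illustration, a client could opt in per request like this (a minimal SolrJ
sketch, assuming the patch accepts preferLocalShards as a plain request parameter):

import org.apache.solr.client.solrj.SolrQuery;

public class PreferLocalShardsExample {
  public static SolrQuery buildQuery() {
    SolrQuery q = new SolrQuery("*:*");
    q.set("preferLocalShards", true); // ask the receiving node to favor its local replicas
    return q;
  }
}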

Before putting more effort into writing test cases, I would like some
comments on this patch so that I know I am heading in the right direction.

Thanks
Sachin


On Wed, Dec 10, 2014 at 4:30 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 12/9/2014 10:55 PM, S G wrote:
  For a distributed query, the request is always sent to all the shards
  even if the originating SolrCore (handling the original distributed
  query) is a replica of one of the shards.
  If the original Solr-Core can check itself before sending http
  requests for any shard, we can probably save some network hopping and
  gain some performance.

 I have to agree with the other replies you've gotten.

 Consider a SolrCloud that is handling 5000 requests per second with a
 replicationFactor of 20 or 30.  This could be one shard or multiple
 shards.  Currently, those requests will be load balanced to the entire
 cluster.  If this option is implemented, suddenly EVERY request will
 have at least one part handled locally ... and unless the index is very
 tiny or 99 percent of the queries hit a Solr cache, one index core
 simply won't be able to handle 5000 queries per second.  Getting a
 single machine capable of handling that load MIGHT be possible, but it
 would likely be *VERY* expensive.

 This would be great as an *OPTION* that can be enabled when the index
 composition and query patterns dictate it will be beneficial ... but it
 definitely should not be default behavior.

 Thanks,
 Shawn


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Proactively going down in anticipation of high load / bad state

2015-01-06 Thread S G
Hi,

For a Solr cloud, is there a setting that allows a core to proactively go
down if it's able to detect some temporary issues like high GC, high
thread counts, a temporary network slowdown, etc.?
Currently we see that a node gets into a distributed deadlock because it's not
able to detect such situations.

I am exploring the Solr code to see if it's possible to take some proactive
action in such cases.
One way could be to have configurable limits for GC time, thread count,
response time, 5-minute rate, etc., and make a core shut down if it senses
problems.
Once that happens, a background thread will monitor the trouble-causing
parameters and recover the downed core when the situation improves.


My current patch can bring down a core for:
1) High thread-counts,
2) High 95thPcRequestTime,
3) Huge # of heavy queries in a given time.

The patch also recovers the core when its health improves.
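To illustrate the idea (this is only a sketch and not the actual patch; the
metric supplier, threshold and polling interval are assumptions), the monitoring
piece could look roughly like this:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.DoubleSupplier;

public class CoreHealthMonitor {
  private final AtomicBoolean healthy = new AtomicBoolean(true);

  // p95RequestTimeMs is whatever metric source the core exposes; maxP95Ms is the configured limit.
  public void start(DoubleSupplier p95RequestTimeMs, double maxP95Ms) {
    ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
    exec.scheduleAtFixedRate(() -> {
      boolean ok = p95RequestTimeMs.getAsDouble() <= maxP95Ms;
      if (healthy.compareAndSet(!ok, ok)) {
        // Transition point: in the real patch this is where the core would be
        // taken down (publish DOWN) or brought back up (recover).
        System.out.println(ok ? "core recovered" : "core marked unhealthy");
      }
    }, 10, 10, TimeUnit.SECONDS);
  }

  public boolean isHealthy() {
    return healthy.get();
  }
}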


If the above seems doable, then I can create a JIRA for more discussion and
implementation.


Thanks
Sachin


Re: Querying locally before sending a distributed request

2014-12-17 Thread S G
I have submitted a patch for this at
https://issues.apache.org/jira/browse/SOLR-6832
Would appreciate if someone can review it.

Thanks
SG

On Wed, Dec 10, 2014 at 4:30 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 12/9/2014 10:55 PM, S G wrote:
  For a distributed query, the request is always sent to all the shards
  even if the originating SolrCore (handling the original distributed
  query) is a replica of one of the shards.
  If the original Solr-Core can check itself before sending http
  requests for any shard, we can probably save some network hopping and
  gain some performance.

 I have to agree with the other replies you've gotten.

 Consider a SolrCloud that is handling 5000 requests per second with a
 replicationFactor of 20 or 30.  This could be one shard or multiple
 shards.  Currently, those requests will be load balanced to the entire
 cluster.  If this option is implemented, suddenly EVERY request will
 have at least one part handled locally ... and unless the index is very
 tiny or 99 percent of the queries hit a Solr cache, one index core
 simply won't be able to handle 5000 queries per second.  Getting a
 single machine capable of handling that load MIGHT be possible, but it
 would likely be *VERY* expensive.

 This would be great as an *OPTION* that can be enabled when the index
 composition and query patterns dictate it will be beneficial ... but it
 definitely should not be default behavior.

 Thanks,
 Shawn


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Querying locally before sending a distributed request

2014-12-10 Thread S G
I have opened https://issues.apache.org/jira/browse/SOLR-6832 to track this.

The performance gain increases if coresPerMachine is > 1 and a single JVM
has cores from 'k' shards.

We can also look into giving more preference to machines with the same IP
address as the current machine (when multiple Tomcats are running on the same
machine).


On Wed, Dec 10, 2014 at 7:14 AM, Steve Davids sdav...@gmail.com wrote:

 bq. In a one-shard case, no query really needs to be forwarded, since any
 replica can fully get the results so in this case no query would be
 forwarded.

 You can pass the request param distrib=false to not distribute the request
 in that particular case at which point it will only gather results from
 that particular host.

 As for the SolrCloud example with n-shards > 1 your overall search request
 time is limited to the slowest shard's response time. So, you would
 potentially be saving one hop, but you are still making n-1 other hops to
 gather all of the other shard's results thus making it a moot point since
 you will be waiting on the other shards to respond before you can return
 the aggregated result list. You will then be on the hook to setup the load
 balancing across replicas of that one particular host you have chosen to
 query as Erick said which could have some gotchyas for people not expecting
 that behavior.

 -Steve

 On Wed, Dec 10, 2014 at 9:26 AM, Erick Erickson erickerick...@gmail.com
 wrote:

 Just skimming, but if I'm reading this right, your suggestion is
 that queries be served locally rather than being forwarded to
 another replica when possible.

 So let's take the one-shard case with N replicas to make sure
 I understand. In a one-shard case, no query really needs to
 be forwarded, since any replica can fully get the results so
 in this case no query would be forwarded.

 If this is a fair summary, then consider the situation where the
 outside world connects to a single server rather than to a
 fronting load balancer. Then only one shard would be doing
 any work

 Or am I off in the weeds?

 That aside, if I've gotten it wrong and you want to put
 up a patch (or even just outline a better approach),
 feel free to open a JIRA and attach a patch...

 Best,
 Erick

 On Tue, Dec 9, 2014 at 11:55 PM, S G sg.online.em...@gmail.com wrote:
  Hello Solr Devs,
 
  I am a developer using Solr and wanted to have some opinion on a
 performance
  change request.
 
  Currently, I see that code flow for a query in SolrCloud is as follows:
 
  For distributed query:
  SolrCore - SearchHandler.handleRequestBody() -
 HttpShardHandler.submit()
 
  For non-distributed query:
  SolrCore - SearchHandler.handleRequestBody() -
 QueryComponent.process()
 
 
  For a distributed query, the request is always sent to all the shards
 even
  if the originating SolrCore (handling the original distributed query)
 is a
  replica of one of the shards.
  If the original Solr-Core can check itself before sending http requests
 for
  any shard, we can probably save some network hopping and gain some
  performance.
 
  If this idea seems feasible, I can submit a JIRA ticket and work on it.
  I am planning to change SearchHandler.handleRequestBody() or
  HttpShardHandler.submit()
 
  Thanks
  SG
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





Querying locally before sending a distributed request

2014-12-09 Thread S G
Hello Solr Devs,

I am a developer using Solr and wanted to get some opinions on a
performance change request.

Currently, I see that code flow for a query in SolrCloud is as follows:

For distributed query:
SolrCore -> SearchHandler.handleRequestBody() -> HttpShardHandler.submit()

For non-distributed query:
SolrCore -> SearchHandler.handleRequestBody() -> QueryComponent.process()


For a distributed query, the request is always sent to all the shards even
if the originating SolrCore (handling the original distributed query) is a
replica of one of the shards.
If the original Solr-Core can check itself before sending http requests for
any shard, we can probably save some network hopping and gain some
performance.

If this idea seems feasible, I can submit a JIRA ticket and work on it.
I am planning to change SearchHandler.handleRequestBody() or
HttpShardHandler.submit()

Thanks
SG