Re: Commit disabled

2019-11-08 Thread Erick Erickson
Please explain the use case more fully, as what you’re asking makes little 
sense. 

You say “manually indexed the item with changes”. How does that change get to 
Solr? The autocommit settings are all about how long it takes a doc _after_ 
it’s indexed in Solr to be searchable. How it gets to Solr has nothing to do 
with these settings.

Here’s more than you want to know about commits: 
https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
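
As a concrete illustration of the distinction above, a minimal sketch (core name and port are placeholders, not taken from this thread): an explicit commit can be sent to the update handler at any time, independently of how the document arrived:

  # explicit hard commit; opens a new searcher so pending docs become visible
  curl 'http://localhost:8983/solr/mycore/update?commit=true'

  # soft commit only; makes pending docs searchable without waiting for autoCommit
  curl 'http://localhost:8983/solr/mycore/update?softCommit=true'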

Also, you mentioned Sitecore, which uses Solr. Perhaps this is more a question for 
Sitecore?

Best,
Erick

> On Nov 8, 2019, at 7:53 AM, Villacorta, David (Arlington) 
>  wrote:
> 
> Thanks for the feedback
> 
> Is there a config setting that can be used for explicit commit? I was 
> thinking the <autoCommit> setting should be handling this already?
> In our issue, the changes are only reflected back to Sitecore once we 
> manually index the item with changes
> 
> Regards
> David Villacorta
> 
> -Original Message-
> From: Emir Arnautović [mailto:emir.arnauto...@sematext.com]
> Sent: Friday, November 08, 2019 7:53 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Commit disabled
> 
> Hi David,
> Index will get updated (hard commit is happening every 15s) but changes will 
> not be visible until you explicitly commit or you reload core. Note that Solr 
> restart reloads cores.
> 
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection Solr & 
> Elasticsearch Consulting Support Training - 
> http://sematext.com/
> 
> 
> 
>> On 8 Nov 2019, at 12:19, Villacorta, David (Arlington) 
>>  wrote:
>> 
>> Just want to confirm, given the following config settings at solrconfig.xml:
>> 
>> <autoCommit>
>>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>>   <openSearcher>false</openSearcher>
>> </autoCommit>
>> 
>> <autoSoftCommit>
>>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>> </autoSoftCommit>
>> 
>> Solr index will not be updated unless created item in Sitecore is manually 
>> indexed, right?
>> 
>> Regards
>> David Villacorta
>> 
> 



RE: Commit disabled

2019-11-08 Thread Villacorta, David (Arlington)
Thanks for the feedback

Is there a config setting that can be used for explicit commit? I was thinking 
the <autoCommit> setting should be handling this already?
In our issue, the changes are only reflected back to Sitecore once we 
manually index the item with changes

Regards
David Villacorta

-Original Message-
From: Emir Arnautović [mailto:emir.arnauto...@sematext.com]
Sent: Friday, November 08, 2019 7:53 PM
To: solr-user@lucene.apache.org
Subject: Re: Commit disabled

Hi David,
Index will get updated (hard commit is happening every 15s) but changes will 
not be visible until you explicitly commit or you reload core. Note that Solr 
restart reloads cores.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch 
Consulting Support Training - 
http://sematext.com/



> On 8 Nov 2019, at 12:19, Villacorta, David (Arlington) 
>  wrote:
>
> Just want to confirm, given the following config settings at solrconfig.xml:
>
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
>
> Solr index will not be updated unless created item in Sitecore is manually 
> indexed, right?
>
> Regards
> David Villacorta
>



Re: Commit disabled

2019-11-08 Thread Emir Arnautović
Hi David,
The index will get updated (a hard commit happens every 15s), but the changes will 
not be visible until you explicitly commit or reload the core. Note that a Solr 
restart reloads cores.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
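
For reference, a minimal sketch of the reload Emir mentions, done through the CoreAdmin API without restarting Solr (the core name is a placeholder):

  curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=mycore'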



> On 8 Nov 2019, at 12:19, Villacorta, David (Arlington) 
>  wrote:
> 
> Just want to confirm, given the following config settings at solrconfig.xml:
> 
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
> 
> Solr index will not be updated unless created item in Sitecore is manually 
> indexed, right?
> 
> Regards
> David Villacorta
> 



Re: Commit too slow?

2018-05-14 Thread Shawn Heisey
On 5/14/2018 11:29 AM, LOPEZ-CORTES Mariano-ext wrote:
> After injecting 200 documents into our Solr server, the commit 
> operation at the end of the process (using ConcurrentUpdateSolrClient) takes 
> 10 minutes. Is that too slow?

There is a wiki page discussing slow commits:

https://wiki.apache.org/solr/SolrPerformanceProblems#Slow_commits

Thanks,
Shawn



Re: commit time in solr cloud

2017-09-11 Thread Susheel Kumar
Hi Wei,

I'm assuming the lastModified time is when latest hard commit happens. Is
that correct?

>> Yes, it's correct.

I also sometimes see a difference between replica and leader commit
timestamps where the diff/lag is less than the autoCommit interval; in your case
you noticed up to 10 mins.
My guess is that because of the different start times of the replica and leader
nodes, their commits happen at different times, and thus you see the difference.

Thanks,
Susheel
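
For reference, a minimal sketch of the per-node check discussed in this thread (host, port, and core name are placeholders):

  curl 'http://localhost:8983/solr/mycore/admin/luke?numTerms=0&wt=json'

In the response, userData.commitTimeMSec and lastModified reflect the last hard commit on that core, so comparing them across nodes shows the lag described below.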

On Fri, Sep 8, 2017 at 3:06 PM, Wei  wrote:

> Hi,
>
> In solr cloud we want to track the last commit time on each node. The
> information source is from the luke handler:
>  admin/luke?numTerms=0&wt=json, e.g.
>
>
> userData: {
>   commitTimeMSec: "1504895505447"
> },
> lastModified: "2017-09-08T18:31:45.447Z"
>
>
>
> I'm assuming the lastModified time is when latest hard commit happens. Is
> that correct?
>
> On all nodes we have autoCommit set to a 15-minute interval. One observation I
> don't understand is that quite often the last commit time on shard leaders lags
> behind the last commit time on replicas; sometimes the lag is over 10
> minutes. My understanding is that as update requests go to the leader first,
> the timer on the leaders would start earlier than on the replicas. Am I
> missing something here?
>
> Thanks,
> Wei
>


Re: Commit/callbacks doesn't happen on core close

2017-02-07 Thread saiks
Thanks for the reply.

The issue is, when the core is unloaded, post commit listeners on the core
are not getting called.

If you see here, the code that calls post commit listeners is commented out.
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java#L789

To reproduce,

1. Have Solr configured with transient size of 1
2. Set softCommit, hardCommit to 1 minute (Commit happens after a minute)
3. Create 2 cores, core1 and core2
4. Ingest into core1, at this point core1 is loaded, core2 should be in
closed state.
5. If we now ingest into or read from core2, after say 10 seconds, core1 is
indexed and closed but commit listeners don't get called.

Thank you





Re: Commit/callbacks doesn't happen on core close

2017-01-30 Thread alessandro.benedetti
Hi Saiks,
I am not following you.
According to the Solr documentation :
"transient=["true"|"false"]. Whether the core should be put in the LRU list
of cores that may be unloaded. NOTE: When a core is unloaded, any
outstanding operations (indexing or query) will be completed before the core
is closed.
old-style: this is an attribute of each individual  tag
new-style: this is an entry in the "core.properties" file for each core."

So if you index 2 cores, even if one of them is selected by the transient
cache to be dismissed, it should carry out all the pending operations
(indexing included).
Also, each core is independent, and it has an independent configuration
(solrconfig and schema).
So, indexing on both should produce effects on both.

Probably I am missing the point; can you explain further? Do you have any
exceptions or anything weird happening on the log side?

Cheers





Re: Commit/callbacks doesn't happen on core close

2017-01-29 Thread saiks
Hi All,

We are a big public company and we are evaluating Solr to store hundreds of
terabytes of data.
Post-commit listeners getting called on core close is a must for us.

It would be great if anyone can help us fix the issue or suggest a
workaround :)

Thank you





Re: Commit required after delete ?

2017-01-12 Thread Mikhail Khludnev
Alessandro,
I'm not sure which code reference you are asking about, but here they are:
http://lucene.apache.org/core/6_3_0/core/org/apache/lucene/index/DirectoryReader.html#openIfChanged-org.apache.lucene.index.DirectoryReader-org.apache.lucene.index.IndexWriter-boolean-
http://blog.mikemccandless.com/2011/06/lucenes-near-real-time-search-is-fast.html


On Thu, Jan 12, 2017 at 2:09 PM, alessandro.benedetti  wrote:

> Interesting Michael, can you pass me the code reference?
>
> Cheers
>
>
>
>



-- 
Sincerely yours
Mikhail Khludnev


Re: Commit required after delete ?

2017-01-12 Thread alessandro.benedetti
Interesting Michael, can you pass me the code reference?

Cheers





Re: Commit required after delete ?

2017-01-06 Thread Mikhail Khludnev
Hello, Friend!
You absolutely need to commit to make deletes visible. What's more, when a
"softCommit" is issued at the Lucene level, there is a flag which ignores
deletes for the sake of performance.
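
A minimal sketch of a delete followed by a commit in the same request (core name, port, and id are placeholders):

  curl 'http://localhost:8983/solr/mycore/update?commit=true' \
       -H 'Content-Type: application/json' -d '{"delete":{"id":"1"}}'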

On 06 Jan 2017 at 10:55, "Dorian Hoxha" wrote:

Hello friends,

Based on what I've read, I think "commit" isn't needed to make deletes
active (like we do with index/update), right ?

Since it just marks an in-memory deleted-id bitmap, right ?

Thank You


Re: commit it taking 1300 ms

2016-09-02 Thread Pushkar Raste
It would be worth looking into iostats of your disks.

On Aug 22, 2016 10:11 AM, "Alessandro Benedetti" 
wrote:

> I agree with the suggestions so far.
> The cache auto-warming doesn't seem the problem as the index is not massive
> and the auto-warm is for only 10 docs.
> Are you using any warming query for the new searcher ?
>
> Are you using soft or hard commit ?
> This can make the difference ( soft are much cheaper, not free but cheaper)
> .
> You said :
> " Actually earlier it was taking less but suddenly it has increased "
>
> What happened ?
> Anyway, there are a lot of questions to answer before we can help you...
>
> Cheers
>
> On Fri, Aug 12, 2016 at 4:58 AM, Esther-Melaine Quansah <
> esther.quan...@lucidworks.com> wrote:
>
> > Midas,
> >
> > I’d like further clarification as well. Are you sending commits along
> with
> > each document that you’re POSTing to Solr? If so, you’re essentially
> either
> > opening a new searcher or flushing to disk with each POST which could
> > explain latency between each request.
> >
> > Thanks,
> >
> > Esther
> > > On Aug 11, 2016, at 12:19 PM, Erick Erickson 
> > wrote:
> > >
> > > bq:  we post json documents through the curl it takes the time (same
> > time i
> > > would like to say that we are not hard committing ). that curl takes
> time
> > > i.e. 1.3 sec.
> > >
> > > OK, I'm really confused. _what_ is taking 1.3 seconds? When you said
> > > commit, I was thinking of Solr's commit operation, which is totally
> > distinct
> > > from just adding a doc to the index. But I read the above statement
> > > as you're saying it takes 1.3 seconds just to send a doc to Solr.
> > >
> > > Let's see the exact curl command you're using please?
> > >
> > > Best,
> > > Erick
> > >
> > >
> > > On Thu, Aug 11, 2016 at 5:32 AM, Emir Arnautovic
> > >  wrote:
> > >> Hi Midas,
> > >>
> > >> 1. How many indexing threads?
> > >> 2. Do you batch documents and what is your batch size?
> > >> 3. How frequently do you commit?
> > >>
> > >> I would recommend:
> > >> 1. Move commits to Solr (set auto soft commit to max allowed time)
> > >> 2. Use batches (bulks)
> > >> 3. tune bulk size and number of threads to achieve max performance.
> > >>
> > >> Thanks,
> > >> Emir
> > >>
> > >>
> > >>
> > >> On 11.08.2016 08:21, Midas A wrote:
> > >>>
> > >>> Emir,
> > >>>
> > >>> other queries:
> > >>>
> > >>> a) Solr cloud : NO
> > >>> b)  > >>> size="5000" initialSize="5000" autowarmCount="10"/>
> > >>> c)   > >>> size="1000" initialSize="1000" autowarmCount="10"/>
> > >>> d)  > >>> size="1000" initialSize="1000" autowarmCount="10"/>
> > >>> e) we are using multi threaded system.
> > >>>
> > >>> On Thu, Aug 11, 2016 at 11:48 AM, Midas A 
> > wrote:
> > >>>
> >  Emir,
> > 
> >  we post json documents through the curl it takes the time (same
> time i
> >  would like to say that we are not hard committing ). that curl takes
> > time
> >  i.e. 1.3 sec.
> > 
> >  On Wed, Aug 10, 2016 at 2:29 PM, Emir Arnautovic <
> >  emir.arnauto...@sematext.com> wrote:
> > 
> > > Hi Midas,
> > >
> > > According to your autocommit configuration and your worry about
> > commit
> > > time I assume that you are doing explicit commits from client code
> > and
> > > that
> > > 1.3s is client observed commit time. If that is the case, than it
> > might
> > > be
> > > opening searcher that is taking time.
> > >
> > > How do you index data - single threaded or multithreaded? How
> > frequently
> > > do you commit from client? Can you let Solr do soft commits instead
> > of
> > > explicitly committing? Do you have warmup queries? Is this
> SolrCloud?
> > > What
> > > is number of servers (what spec), shards, docs?
> > >
> > > In any case monitoring can give you more info about server/Solr
> > behavior
> > > and help you diagnose issues more easily/precisely. One such
> > monitoring
> > > tool is our SPM .
> > >
> > > Regards,
> > > Emir
> > >
> > > --
> > > Monitoring * Alerting * Anomaly Detection * Centralized Log
> > Management
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> > > On 10.08.2016 05:20, Midas A wrote:
> > >
> > >> Thanks for replying
> > >>
> > >> index size:9GB
> > >> 2000 docs/sec.
> > >>
> > >> Actually earlier it was taking less but suddenly it has increased
> .
> > >>
> > >> Currently we do not have any monitoring  tool.
> > >>
> > >> On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
> > >> emir.arnauto...@sematext.com> wrote:
> > >>
> > >> Hi Midas,
> > >>>
> > >>> Can you give us more details on your index: size, number of new
> > docs
> > >>> between commits. Why do you think 1.3s for commit is to much and
> > why
> > >>> do
> > >>> you
> > 

Re: commit it taking 1300 ms

2016-08-22 Thread Alessandro Benedetti
I agree with the suggestions so far.
The cache auto-warming doesn't seem to be the problem, as the index is not massive
and the auto-warm is for only 10 docs.
Are you using any warming query for the new searcher?

Are you using soft or hard commits?
This can make a difference (soft commits are much cheaper; not free, but cheaper).
You said:
"Actually earlier it was taking less but suddenly it has increased"

What happened?
Anyway, there are a lot of questions to answer before we can help you...

Cheers

On Fri, Aug 12, 2016 at 4:58 AM, Esther-Melaine Quansah <
esther.quan...@lucidworks.com> wrote:

> Midas,
>
> I’d like further clarification as well. Are you sending commits along with
> each document that you’re POSTing to Solr? If so, you’re essentially either
> opening a new searcher or flushing to disk with each POST which could
> explain latency between each request.
>
> Thanks,
>
> Esther
> > On Aug 11, 2016, at 12:19 PM, Erick Erickson 
> wrote:
> >
> > bq:  we post json documents through the curl it takes the time (same
> time i
> > would like to say that we are not hard committing ). that curl takes time
> > i.e. 1.3 sec.
> >
> > OK, I'm really confused. _what_ is taking 1.3 seconds? When you said
> > commit, I was thinking of Solr's commit operation, which is totally
> distinct
> > from just adding a doc to the index. But I read the above statement
> > as you're saying it takes 1.3 seconds just to send a doc to Solr.
> >
> > Let's see the exact curl command you're using please?
> >
> > Best,
> > Erick
> >
> >
> > On Thu, Aug 11, 2016 at 5:32 AM, Emir Arnautovic
> >  wrote:
> >> Hi Midas,
> >>
> >> 1. How many indexing threads?
> >> 2. Do you batch documents and what is your batch size?
> >> 3. How frequently do you commit?
> >>
> >> I would recommend:
> >> 1. Move commits to Solr (set auto soft commit to max allowed time)
> >> 2. Use batches (bulks)
> >> 3. tune bulk size and number of threads to achieve max performance.
> >>
> >> Thanks,
> >> Emir
> >>
> >>
> >>
> >> On 11.08.2016 08:21, Midas A wrote:
> >>>
> >>> Emir,
> >>>
> >>> other queries:
> >>>
> >>> a) Solr cloud : NO
> >>> b)  >>> size="5000" initialSize="5000" autowarmCount="10"/>
> >>> c)   >>> size="1000" initialSize="1000" autowarmCount="10"/>
> >>> d)  >>> size="1000" initialSize="1000" autowarmCount="10"/>
> >>> e) we are using multi threaded system.
> >>>
> >>> On Thu, Aug 11, 2016 at 11:48 AM, Midas A 
> wrote:
> >>>
>  Emir,
> 
>  we post json documents through the curl it takes the time (same time i
>  would like to say that we are not hard committing ). that curl takes
> time
>  i.e. 1.3 sec.
> 
>  On Wed, Aug 10, 2016 at 2:29 PM, Emir Arnautovic <
>  emir.arnauto...@sematext.com> wrote:
> 
> > Hi Midas,
> >
> > According to your autocommit configuration and your worry about
> commit
> > time I assume that you are doing explicit commits from client code
> and
> > that
> > 1.3s is client observed commit time. If that is the case, than it
> might
> > be
> > opening searcher that is taking time.
> >
> > How do you index data - single threaded or multithreaded? How
> frequently
> > do you commit from client? Can you let Solr do soft commits instead
> of
> > explicitly committing? Do you have warmup queries? Is this SolrCloud?
> > What
> > is number of servers (what spec), shards, docs?
> >
> > In any case monitoring can give you more info about server/Solr
> behavior
> > and help you diagnose issues more easily/precisely. One such
> monitoring
> > tool is our SPM .
> >
> > Regards,
> > Emir
> >
> > --
> > Monitoring * Alerting * Anomaly Detection * Centralized Log
> Management
> > Solr & Elasticsearch Support * http://sematext.com/
> >
> > On 10.08.2016 05:20, Midas A wrote:
> >
> >> Thanks for replying
> >>
> >> index size:9GB
> >> 2000 docs/sec.
> >>
> >> Actually earlier it was taking less but suddenly it has increased .
> >>
> >> Currently we do not have any monitoring  tool.
> >>
> >> On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
> >> emir.arnauto...@sematext.com> wrote:
> >>
> >> Hi Midas,
> >>>
> >>> Can you give us more details on your index: size, number of new
> docs
> >>> between commits. Why do you think 1.3s for commit is to much and
> why
> >>> do
> >>> you
> >>> need it to take less? Did you do any system/Solr monitoring?
> >>>
> >>> Emir
> >>>
> >>>
> >>> On 09.08.2016 14:10, Midas A wrote:
> >>>
> >>> please reply it is urgent.
> 
>  On Tue, Aug 9, 2016 at 11:17 AM, Midas A 
>  wrote:
> 
>  Hi ,
> 
> > commit is taking more than 1300 ms . what should i check on
> server.
> >
> 

Re: commit it taking 1300 ms

2016-08-12 Thread Esther-Melaine Quansah
Midas,

I’d like further clarification as well. Are you sending commits along with each 
document that you’re POSTing to Solr? If so, you’re essentially either opening 
a new searcher or flushing to disk with each POST which could explain latency 
between each request.

Thanks,

Esther
> On Aug 11, 2016, at 12:19 PM, Erick Erickson  wrote:
> 
> bq:  we post json documents through the curl it takes the time (same time i
> would like to say that we are not hard committing ). that curl takes time
> i.e. 1.3 sec.
> 
> OK, I'm really confused. _what_ is taking 1.3 seconds? When you said
> commit, I was thinking of Solr's commit operation, which is totally distinct
> from just adding a doc to the index. But I read the above statement
> as you're saying it takes 1.3 seconds just to send a doc to Solr.
> 
> Let's see the exact curl command you're using please?
> 
> Best,
> Erick
> 
> 
> On Thu, Aug 11, 2016 at 5:32 AM, Emir Arnautovic
>  wrote:
>> Hi Midas,
>> 
>> 1. How many indexing threads?
>> 2. Do you batch documents and what is your batch size?
>> 3. How frequently do you commit?
>> 
>> I would recommend:
>> 1. Move commits to Solr (set auto soft commit to max allowed time)
>> 2. Use batches (bulks)
>> 3. tune bulk size and number of threads to achieve max performance.
>> 
>> Thanks,
>> Emir
>> 
>> 
>> 
>> On 11.08.2016 08:21, Midas A wrote:
>>> 
>>> Emir,
>>> 
>>> other queries:
>>> 
>>> a) Solr cloud : NO
>>> b) >> size="5000" initialSize="5000" autowarmCount="10"/>
>>> c)  >> size="1000" initialSize="1000" autowarmCount="10"/>
>>> d) >> size="1000" initialSize="1000" autowarmCount="10"/>
>>> e) we are using multi threaded system.
>>> 
>>> On Thu, Aug 11, 2016 at 11:48 AM, Midas A  wrote:
>>> 
 Emir,
 
 we post json documents through the curl it takes the time (same time i
 would like to say that we are not hard committing ). that curl takes time
 i.e. 1.3 sec.
 
 On Wed, Aug 10, 2016 at 2:29 PM, Emir Arnautovic <
 emir.arnauto...@sematext.com> wrote:
 
> Hi Midas,
> 
> According to your autocommit configuration and your worry about commit
> time I assume that you are doing explicit commits from client code and
> that
> 1.3s is client observed commit time. If that is the case, than it might
> be
> opening searcher that is taking time.
> 
> How do you index data - single threaded or multithreaded? How frequently
> do you commit from client? Can you let Solr do soft commits instead of
> explicitly committing? Do you have warmup queries? Is this SolrCloud?
> What
> is number of servers (what spec), shards, docs?
> 
> In any case monitoring can give you more info about server/Solr behavior
> and help you diagnose issues more easily/precisely. One such monitoring
> tool is our SPM .
> 
> Regards,
> Emir
> 
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
> 
> On 10.08.2016 05:20, Midas A wrote:
> 
>> Thanks for replying
>> 
>> index size:9GB
>> 2000 docs/sec.
>> 
>> Actually earlier it was taking less but suddenly it has increased .
>> 
>> Currently we do not have any monitoring  tool.
>> 
>> On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
>> emir.arnauto...@sematext.com> wrote:
>> 
>> Hi Midas,
>>> 
>>> Can you give us more details on your index: size, number of new docs
>>> between commits. Why do you think 1.3s for commit is to much and why
>>> do
>>> you
>>> need it to take less? Did you do any system/Solr monitoring?
>>> 
>>> Emir
>>> 
>>> 
>>> On 09.08.2016 14:10, Midas A wrote:
>>> 
>>> please reply it is urgent.
 
 On Tue, Aug 9, 2016 at 11:17 AM, Midas A 
 wrote:
 
 Hi ,
 
> commit is taking more than 1300 ms . what should i check on server.
> 
> below is my configuration .
> 
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
> 
> 
> 
> --
>>> 
>>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>>> Solr & Elasticsearch Support * http://sematext.com/
>>> 
>>> 
>>> 
>> 
>> --
>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>> Solr & Elasticsearch Support * http://sematext.com/
>> 



Re: commit it taking 1300 ms

2016-08-11 Thread Erick Erickson
bq:  we post json documents through the curl it takes the time (same time i
would like to say that we are not hard committing ). that curl takes time
i.e. 1.3 sec.

OK, I'm really confused. _what_ is taking 1.3 seconds? When you said
commit, I was thinking of Solr's commit operation, which is totally distinct
from just adding a doc to the index. But I read the above statement
as you're saying it takes 1.3 seconds just to send a doc to Solr.

Let's see the exact curl command you're using please?

Best,
Erick
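
For context, the two operations Erick distinguishes, as a minimal sketch (core name, port, and document are placeholders):

  # just adds the doc; it is not searchable until some commit happens
  curl 'http://localhost:8983/solr/mycore/update' \
       -H 'Content-Type: application/json' -d '[{"id":"1"}]'

  # the same add plus an explicit hard commit in one request
  curl 'http://localhost:8983/solr/mycore/update?commit=true' \
       -H 'Content-Type: application/json' -d '[{"id":"1"}]'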


On Thu, Aug 11, 2016 at 5:32 AM, Emir Arnautovic
 wrote:
> Hi Midas,
>
> 1. How many indexing threads?
> 2. Do you batch documents and what is your batch size?
> 3. How frequently do you commit?
>
> I would recommend:
> 1. Move commits to Solr (set auto soft commit to max allowed time)
> 2. Use batches (bulks)
> 3. tune bulk size and number of threads to achieve max performance.
>
> Thanks,
> Emir
>
>
>
> On 11.08.2016 08:21, Midas A wrote:
>>
>> Emir,
>>
>> other queries:
>>
>> a) Solr cloud : NO
>> b) > size="5000" initialSize="5000" autowarmCount="10"/>
>> c)  > size="1000" initialSize="1000" autowarmCount="10"/>
>> d) > size="1000" initialSize="1000" autowarmCount="10"/>
>> e) we are using multi threaded system.
>>
>> On Thu, Aug 11, 2016 at 11:48 AM, Midas A  wrote:
>>
>>> Emir,
>>>
>>> we post json documents through the curl it takes the time (same time i
>>> would like to say that we are not hard committing ). that curl takes time
>>> i.e. 1.3 sec.
>>>
>>> On Wed, Aug 10, 2016 at 2:29 PM, Emir Arnautovic <
>>> emir.arnauto...@sematext.com> wrote:
>>>
 Hi Midas,

 According to your autocommit configuration and your worry about commit
 time I assume that you are doing explicit commits from client code and
 that
 1.3s is client observed commit time. If that is the case, than it might
 be
 opening searcher that is taking time.

 How do you index data - single threaded or multithreaded? How frequently
 do you commit from client? Can you let Solr do soft commits instead of
 explicitly committing? Do you have warmup queries? Is this SolrCloud?
 What
 is number of servers (what spec), shards, docs?

 In any case monitoring can give you more info about server/Solr behavior
 and help you diagnose issues more easily/precisely. One such monitoring
 tool is our SPM .

 Regards,
 Emir

 --
 Monitoring * Alerting * Anomaly Detection * Centralized Log Management
 Solr & Elasticsearch Support * http://sematext.com/

 On 10.08.2016 05:20, Midas A wrote:

> Thanks for replying
>
> index size:9GB
> 2000 docs/sec.
>
> Actually earlier it was taking less but suddenly it has increased .
>
> Currently we do not have any monitoring  tool.
>
> On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
> emir.arnauto...@sematext.com> wrote:
>
> Hi Midas,
>>
>> Can you give us more details on your index: size, number of new docs
>> between commits. Why do you think 1.3s for commit is to much and why
>> do
>> you
>> need it to take less? Did you do any system/Solr monitoring?
>>
>> Emir
>>
>>
>> On 09.08.2016 14:10, Midas A wrote:
>>
>> please reply it is urgent.
>>>
>>> On Tue, Aug 9, 2016 at 11:17 AM, Midas A 
>>> wrote:
>>>
>>> Hi ,
>>>
 commit is taking more than 1300 ms . what should i check on server.

 below is my configuration .

 <autoCommit>
   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>
 <autoSoftCommit>
   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
 </autoSoftCommit>



 --
>>
>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>>
>>
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>


Re: commit it taking 1300 ms

2016-08-11 Thread Emir Arnautovic

Hi Midas,

1. How many indexing threads?
2. Do you batch documents and what is your batch size?
3. How frequently do you commit?

I would recommend:
1. Move commits to Solr (set auto soft commit to max allowed time)
2. Use batches (bulks)
3. tune bulk size and number of threads to achieve max performance.

Thanks,
Emir
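
A minimal sketch of recommendations 1 and 2 above (core name, port, and documents are placeholders): send documents in bulk and let Solr schedule the commit via commitWithin instead of committing per request:

  curl 'http://localhost:8983/solr/mycore/update?commitWithin=10000' \
       -H 'Content-Type: application/json' \
       -d '[{"id":"1"},{"id":"2"},{"id":"3"}]'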


On 11.08.2016 08:21, Midas A wrote:

Emir,

other queries:

a) Solr cloud : NO
b) size="5000" initialSize="5000" autowarmCount="10"/>
c) size="1000" initialSize="1000" autowarmCount="10"/>
d) size="1000" initialSize="1000" autowarmCount="10"/>
e) we are using multi threaded system.

On Thu, Aug 11, 2016 at 11:48 AM, Midas A  wrote:


Emir,

we post json documents through the curl it takes the time (same time i
would like to say that we are not hard committing ). that curl takes time
i.e. 1.3 sec.

On Wed, Aug 10, 2016 at 2:29 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:


Hi Midas,

According to your autocommit configuration and your worry about commit
time I assume that you are doing explicit commits from client code and that
1.3s is client observed commit time. If that is the case, than it might be
opening searcher that is taking time.

How do you index data - single threaded or multithreaded? How frequently
do you commit from client? Can you let Solr do soft commits instead of
explicitly committing? Do you have warmup queries? Is this SolrCloud? What
is number of servers (what spec), shards, docs?

In any case monitoring can give you more info about server/Solr behavior
and help you diagnose issues more easily/precisely. One such monitoring
tool is our SPM .

Regards,
Emir

--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On 10.08.2016 05:20, Midas A wrote:


Thanks for replying

index size:9GB
2000 docs/sec.

Actually earlier it was taking less but suddenly it has increased .

Currently we do not have any monitoring  tool.

On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

Hi Midas,

Can you give us more details on your index: size, number of new docs
between commits. Why do you think 1.3s for commit is to much and why do
you
need it to take less? Did you do any system/Solr monitoring?

Emir


On 09.08.2016 14:10, Midas A wrote:

please reply it is urgent.

On Tue, Aug 9, 2016 at 11:17 AM, Midas A  wrote:

Hi ,


commit is taking more than 1300 ms . what should i check on server.

below is my configuration .

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>



--

Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/





--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: commit it taking 1300 ms

2016-08-11 Thread Midas A
Emir,

other queries:

a) Solr cloud : NO
b) size="5000" initialSize="5000" autowarmCount="10"/>
c) size="1000" initialSize="1000" autowarmCount="10"/>
d) size="1000" initialSize="1000" autowarmCount="10"/>
e) we are using multi threaded system.

On Thu, Aug 11, 2016 at 11:48 AM, Midas A  wrote:

> Emir,
>
> we post json documents through the curl it takes the time (same time i
> would like to say that we are not hard committing ). that curl takes time
> i.e. 1.3 sec.
>
> On Wed, Aug 10, 2016 at 2:29 PM, Emir Arnautovic <
> emir.arnauto...@sematext.com> wrote:
>
>> Hi Midas,
>>
>> According to your autocommit configuration and your worry about commit
>> time I assume that you are doing explicit commits from client code and that
>> 1.3s is client observed commit time. If that is the case, than it might be
>> opening searcher that is taking time.
>>
>> How do you index data - single threaded or multithreaded? How frequently
>> do you commit from client? Can you let Solr do soft commits instead of
>> explicitly committing? Do you have warmup queries? Is this SolrCloud? What
>> is number of servers (what spec), shards, docs?
>>
>> In any case monitoring can give you more info about server/Solr behavior
>> and help you diagnose issues more easily/precisely. One such monitoring
>> tool is our SPM .
>>
>> Regards,
>> Emir
>>
>> --
>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>> On 10.08.2016 05:20, Midas A wrote:
>>
>>> Thanks for replying
>>>
>>> index size:9GB
>>> 2000 docs/sec.
>>>
>>> Actually earlier it was taking less but suddenly it has increased .
>>>
>>> Currently we do not have any monitoring  tool.
>>>
>>> On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
>>> emir.arnauto...@sematext.com> wrote:
>>>
>>> Hi Midas,

 Can you give us more details on your index: size, number of new docs
 between commits. Why do you think 1.3s for commit is to much and why do
 you
 need it to take less? Did you do any system/Solr monitoring?

 Emir


 On 09.08.2016 14:10, Midas A wrote:

 please reply it is urgent.
>
> On Tue, Aug 9, 2016 at 11:17 AM, Midas A  wrote:
>
> Hi ,
>
>> commit is taking more than 1300 ms . what should i check on server.
>>
>> below is my configuration .
>>
>> <autoCommit>
>>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>>   <openSearcher>false</openSearcher>
>> </autoCommit>
>> <autoSoftCommit>
>>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>> </autoSoftCommit>
>>
>>
>>
>> --
 Monitoring * Alerting * Anomaly Detection * Centralized Log Management
 Solr & Elasticsearch Support * http://sematext.com/



>


Re: commit it taking 1300 ms

2016-08-11 Thread Midas A
Emir,

We post JSON documents through curl and it takes time (at the same time, I
would like to say that we are not hard committing). That curl request takes
1.3 sec.

On Wed, Aug 10, 2016 at 2:29 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> Hi Midas,
>
> According to your autocommit configuration and your worry about commit
> time I assume that you are doing explicit commits from client code and that
> 1.3s is client observed commit time. If that is the case, than it might be
> opening searcher that is taking time.
>
> How do you index data - single threaded or multithreaded? How frequently
> do you commit from client? Can you let Solr do soft commits instead of
> explicitly committing? Do you have warmup queries? Is this SolrCloud? What
> is number of servers (what spec), shards, docs?
>
> In any case monitoring can give you more info about server/Solr behavior
> and help you diagnose issues more easily/precisely. One such monitoring
> tool is our SPM .
>
> Regards,
> Emir
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
> On 10.08.2016 05:20, Midas A wrote:
>
>> Thanks for replying
>>
>> index size:9GB
>> 2000 docs/sec.
>>
>> Actually earlier it was taking less but suddenly it has increased .
>>
>> Currently we do not have any monitoring  tool.
>>
>> On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
>> emir.arnauto...@sematext.com> wrote:
>>
>> Hi Midas,
>>>
>>> Can you give us more details on your index: size, number of new docs
>>> between commits. Why do you think 1.3s for commit is to much and why do
>>> you
>>> need it to take less? Did you do any system/Solr monitoring?
>>>
>>> Emir
>>>
>>>
>>> On 09.08.2016 14:10, Midas A wrote:
>>>
>>> please reply it is urgent.

 On Tue, Aug 9, 2016 at 11:17 AM, Midas A  wrote:

 Hi ,

> commit is taking more than 1300 ms . what should i check on server.
>
> below is my configuration .
>
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
>
>
>
> --
>>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>>> Solr & Elasticsearch Support * http://sematext.com/
>>>
>>>
>>>


Re: commit it taking 1300 ms

2016-08-10 Thread Emir Arnautovic

Hi Midas,

According to your autocommit configuration and your worry about commit
time, I assume that you are doing explicit commits from client code and
that 1.3s is the client-observed commit time. If that is the case, then it
might be the opening of a new searcher that is taking the time.


How do you index data - single threaded or multithreaded? How frequently 
do you commit from client? Can you let Solr do soft commits instead of 
explicitly committing? Do you have warmup queries? Is this SolrCloud? 
What is number of servers (what spec), shards, docs?


In any case monitoring can give you more info about server/Solr behavior 
and help you diagnose issues more easily/precisely. One such monitoring 
tool is our SPM .


Regards,
Emir

--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On 10.08.2016 05:20, Midas A wrote:

Thanks for replying

index size:9GB
2000 docs/sec.

Actually earlier it was taking less but suddenly it has increased .

Currently we do not have any monitoring  tool.

On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:


Hi Midas,

Can you give us more details on your index: size, number of new docs
between commits. Why do you think 1.3s for commit is to much and why do you
need it to take less? Did you do any system/Solr monitoring?

Emir


On 09.08.2016 14:10, Midas A wrote:


please reply it is urgent.

On Tue, Aug 9, 2016 at 11:17 AM, Midas A  wrote:

Hi ,

commit is taking more than 1300 ms . what should i check on server.

below is my configuration .

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>




--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/




Re: commit it taking 1300 ms

2016-08-09 Thread Midas A
Thanks for replying

index size:9GB
2000 docs/sec.

Actually, earlier it was taking less time but suddenly it has increased.

Currently we do not have any monitoring tool.

On Tue, Aug 9, 2016 at 7:00 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> Hi Midas,
>
> Can you give us more details on your index: size, number of new docs
> between commits. Why do you think 1.3s for commit is to much and why do you
> need it to take less? Did you do any system/Solr monitoring?
>
> Emir
>
>
> On 09.08.2016 14:10, Midas A wrote:
>
>> please reply it is urgent.
>>
>> On Tue, Aug 9, 2016 at 11:17 AM, Midas A  wrote:
>>
>> Hi ,
>>>
>>> commit is taking more than 1300 ms . what should i check on server.
>>>
>>> below is my configuration .
>>>
>>> <autoCommit>
>>>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>>>   <openSearcher>false</openSearcher>
>>> </autoCommit>
>>> <autoSoftCommit>
>>>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>>> </autoSoftCommit>
>>>
>>>
>>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>


Re: commit it taking 1300 ms

2016-08-09 Thread Emir Arnautovic

Hi Midas,

Can you give us more details on your index: size, number of new docs
between commits. Why do you think 1.3s for a commit is too much, and why do
you need it to take less? Did you do any system/Solr monitoring?


Emir

On 09.08.2016 14:10, Midas A wrote:

please reply it is urgent.

On Tue, Aug 9, 2016 at 11:17 AM, Midas A  wrote:


Hi ,

commit is taking more than 1300 ms . what should i check on server.

below is my configuration .

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>




--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: commit it taking 1300 ms

2016-08-09 Thread Midas A
please reply it is urgent.

On Tue, Aug 9, 2016 at 11:17 AM, Midas A  wrote:

> Hi ,
>
> commit is taking more than 1300 ms . what should i check on server.
>
> below is my configuration .
>
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
>
>


Re: Commit (hard) at shutdown?

2016-05-23 Thread Per Steffensen
Sorry, I did not see the responses here because I found out myself. It
definitely seems like a hard commit is performed when shutting down
gracefully. The info I got from production was wrong.
It is not necessarily obvious that you will lose data on "kill -9". The
tlog ought to save you, but it is probably not 100% bulletproof.

We are not using the bin/solr script (yet)

On 21/05/16 04:02, Shawn Heisey wrote:

On 5/20/2016 2:51 PM, Jon Drews wrote:

I would be interested in an answer to this question.

 From my research it looks like it will do a hard commit if cleanly shut
down. However if you "kill -9" it you'll loose data (obviously). Perhaps
production isn't cleanly shutting down solr?
https://dzone.com/articles/understanding-solr-soft

I do not know whether a graceful shutdown does a hard commit or not.

I do know that all versions of Solr that utilize the bin/solr script are
configured by default to forcibly kill Solr only five seconds after the
graceful shutdown is requested.  Five seconds is usually not enough time
for production installs, so it needs to be increased.  The only way to
do this currently is to edit the bin/solr script directly.

Thanks,
Shawn






Re: Commit (hard) at shutdown?

2016-05-20 Thread Shawn Heisey
On 5/20/2016 2:51 PM, Jon Drews wrote:
> I would be interested in an answer to this question.
>
> From my research it looks like it will do a hard commit if cleanly shut
> down. However if you "kill -9" it you'll loose data (obviously). Perhaps
> production isn't cleanly shutting down solr?
> https://dzone.com/articles/understanding-solr-soft

I do not know whether a graceful shutdown does a hard commit or not.

I do know that all versions of Solr that utilize the bin/solr script are
configured by default to forcibly kill Solr only five seconds after the
graceful shutdown is requested.  Five seconds is usually not enough time
for production installs, so it needs to be increased.  The only way to
do this currently is to edit the bin/solr script directly.

Thanks,
Shawn



Re: Commit (hard) at shutdown?

2016-05-20 Thread Jon Drews
I would be interested in an answer to this question.

From my research it looks like it will do a hard commit if cleanly shut
down. However, if you "kill -9" it you'll lose data (obviously). Perhaps
production isn't cleanly shutting down Solr?
https://dzone.com/articles/understanding-solr-soft

Jon Drews
jondrews.com

On Wed, May 18, 2016 at 3:55 AM, Per Steffensen  wrote:

> Hi
>
> Solr 5.1.
> Someone in production in my organization claims that even though Solrs are
> shut down gracefully, there can be huge tlogs to replay when starting Solrs
> again. We are doing heavy indexing right up until Solrs are shut down, and
> we have <autoCommit> set to 1 min. Can anyone confirm (or the opposite)
> that Solrs, upon graceful shutdown, OUGHT TO do a (hard) commit, leaving
> tlogs empty (= nothing to replay when starting again)?
>
> Regards, Per Steffensen
>


Re: Commit after every document - alternate approach

2016-03-04 Thread Shawn Heisey
On 3/3/2016 11:36 PM, sangs8788 wrote:
> When a commit fails, the document doesnt get cleared out from MQ and there is
> a task which runs in a background to republish the files to SOLR. If we do a
> batch commit we will not know we will end up redoing the same batch commit
> again. We currenlty have a client side commit which issue the command to
> SOLR. commit() returns a status code. If we are planning to use
> commitwithin(), I dont think it will actually return any result from solr
> since it is time oriented.

Do your indexing and commits in batches, as already recommended.  I'd
start with 1000 and go up or down from there as needed.  If the batch
indexing fails, or the commit fails, consider the entire batch failed. 
That may not be the end of the world, though -- if the indexing was
successful (and didn't use ConcurrentUpdateSolrClient), then those
updates will be stored in the Solr transaction log, and will be replayed
if Solr is restarted or the core is reloaded.

If you want to be absolutely certain that the update/commit succeeded by
verifying data, one thing you *could* do is send a batch update, do a
commit, and then request every document in the batch with a query that
includes a limited fl parameter, and verify that the document is present
and the values of the fields requested in the fl parameter are correct. 
I would probably do that query with {!cache=false} to avoid polluting
Solr's caches.
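
A minimal sketch of that verification query, following the {!cache=false} suggestion above (core name, ids, and the fl field are placeholders):

  curl 'http://localhost:8983/solr/mycore/select' \
       --data-urlencode 'q={!cache=false}id:(1 OR 2 OR 3)' \
       --data-urlencode 'fl=id'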

Almost every index update you make can be simply made again without
danger.  The exceptions are certain kinds of atomic updates, and certain
situations with deletes.  It's probably best to avoid doing those kinds
of updates, which are described below:

If you're doing atomic updates that increment or decrement a field
value, or atomic updates that add a new value to a multivalued field,
the results will be wrong if that update is repeated, although they
would be correct if the update is replayed from Solr's transaction log,
because atomic updates are no longer atomic when they hit the
transaction log -- they include values for every field in the document,
as if the document were built from scratch.

If you are explicitly deleting a document before replacing it, and those
actions were re-done in the opposite order, then the document would be
missing from the index.  Because Solr handles the deletion automatically
when a document is being updated/replaced, explicit deleting is not
recommended for those situations.

> If we go with SOLR autocommit, is there a way to send a response to MQ
> saying commit successful ?

If commits are completely automatic (autoCommit, autoSoftCommit, or
commitWithin), there's no way for a program to be sure that they have
completed.

The general recommendation for Solr indexing, especially if your
pipeline is multi-threaded, is to simply send your updates, let Solr
handle commits, and rely on the design of Lucene combined with Solr's
transaction logs to keep your data safe.  This approach does mean that
when things go wrong it may be a while before new data is searchable.

Emir's reply is spot on.  Solr is not recommended as a primary data store.

Thanks,
Shawn



Re: Commit after every document - alternate approach

2016-03-04 Thread Emir Arnautovic

Hi Sangeetha,
It seems to me that you are using Solr as primary data store? If that is 
true, you should not do that - you should have some other store that is 
transactional and can support what you are trying to do with Solr. If 
you are not using Solr as primary store, and it is critical to have Solr 
in sync, you can run periodical (about same frequency as Solr commits) 
checks that will ensure the latest data reached Solr.


Regards,
Emir

On 04.03.2016 05:46, sangs8788 wrote:

Hi Emir,

Right now we are having only inserts into SOLR. The main reason for having
commit after each document is to get a guarantee that the document has got
indexed in solr. Until the commit status is received back the document will
not be deleted from MQ. So that even if there is a commit failure the
document can be resent from MQ.

Thanks
Sangeetha





--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Commit after every document - alternate approach

2016-03-03 Thread sangs8788
When a commit fails, the document doesn't get cleared out from MQ, and there is
a task which runs in the background to republish the files to SOLR. If we do a
batch commit, we will not know which documents failed and will end up redoing
the same batch commit again. We currently have a client-side commit which issues
the command to SOLR. commit() returns a status code. If we are planning to use
commitWithin(), I don't think it will actually return any result from Solr,
since it is time oriented.

If we go with SOLR autocommit, is there a way to send a response to MQ
saying the commit was successful?

Thanks
Sangeetha





Re: Commit after every document - alternate approach

2016-03-03 Thread Walter Underwood
So batch them. You get a response back from Solr indicating whether the document
was accepted. If that fails, there is a failure. What do you do then?

After every 100 docs or one minute, do a commit. Then delete the documents from 
the input queue. What do you do when the commit fails?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Mar 3, 2016, at 8:46 PM, sangs8788  
> wrote:
> 
> Hi Emir,
> 
> Right now we are having only inserts into SOLR. The main reason for having
> commit after each document is to get a guarantee that the document has got
> indexed in solr. Until the commit status is received back the document will
> not be deleted from MQ. So that even if there is a commit failure the
> document can be resent from MQ.
> 
> Thanks
> Sangeetha
> 
> 
> 



Re: Commit after every document - alternate approach

2016-03-03 Thread Walter Underwood
If you need transactions, you should use a different system, like MarkLogic.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Mar 3, 2016, at 8:46 PM, sangs8788  
> wrote:
> 
> Hi Emir,
> 
> Right now we are having only inserts into SOLR. The main reason for having
> commit after each document is to get a guarantee that the document has got
> indexed in solr. Until the commit status is received back the document will
> not be deleted from MQ. So that even if there is a commit failure the
> document can be resent from MQ.
> 
> Thanks
> Sangeetha
> 
> 
> 



Re: Commit after every document - alternate approach

2016-03-03 Thread sangs8788
Hi Varun,

We don't have a SOLR Cloud setup in our system; we have a Master-Slave
architecture. In that case I don't see a way where SOLR can guarantee
whether a document got indexed/committed successfully or not.

We even thought about having a flag set up in the DB for whichever documents
were committed to SOLR, but that is also not feasible because it again
requires a return status from SOLR.

The other option is to run a dataimport periodically to verify that all the
documents got indexed.

Is there any other option which I have missed? Please let me know.

Thanks
Sangeetha





Re: Commit after every document - alternate approach

2016-03-03 Thread sangs8788
Hi Emir,

Right now we have only inserts into SOLR. The main reason for having a
commit after each document is to get a guarantee that the document has been
indexed in Solr. Until the commit status is received back, the document will
not be deleted from MQ, so that even if there is a commit failure the
document can be resent from MQ.

Thanks
Sangeetha



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Commit-after-every-document-alternate-approach-tp4260946p4261575.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Commit after every document - alternate approach

2016-03-02 Thread Varun Thacker
Hi Sangeetha,

Well, I don't think you need to commit after every document add.

You can rely on Solr's transaction log feature. If you are using SolrCloud
it's mandatory to have a transaction log, so every document gets written
to the tlog. Now say a node crashes: even if documents were not committed,
since they are present in the tlog, Solr will replay them on startup.

Also, if you are using SolrCloud and have multiple replicas, you should use
the min_rf feature to make sure that N replicas acknowledge the write
before you get back success -
https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance
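
For illustration, a minimal SolrJ sketch of a min_rf check (the target value and document are made up; exactly where the achieved rf is reported can vary by version, and CloudSolrClient also has helper methods for reading it):

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.request.UpdateRequest;
    import org.apache.solr.client.solrj.response.UpdateResponse;
    import org.apache.solr.common.SolrInputDocument;

    public class MinRfAdd {
        public static void add(SolrClient solr, SolrInputDocument doc) throws Exception {
            UpdateRequest req = new UpdateRequest();
            req.setParam("min_rf", "2"); // ask Solr to report the achieved replication factor
            req.add(doc);
            UpdateResponse rsp = req.process(solr);
            // "rf" is how many replicas acknowledged the write; if it is below
            // the target, the client should re-queue or retry the document.
            Object rf = rsp.getResponseHeader().get("rf");
            System.out.println("achieved replication factor: " + rf);
        }
    }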

On Wed, Mar 2, 2016 at 3:41 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> Hi Sangeetha,
> What is sure is that it is not going to work - with 200-300K doc/hour,
> there will be >50 commits/second, meaning there is <20ms time for
> doc+commit.
> What you can do is let Solr handle commits, and maybe use real-time get to
> verify a doc is in Solr, or do some periodic sanity checks.
> Are you doing document updates, so that in-order Solr updates are the reason
> why you commit each doc before moving to the next doc?
>
> Regards,
> Emir
>
>
> On 02.03.2016 09:06, sangeetha.subraman...@gtnexus.com wrote:
>
>> Hi All,
>>
>> I am trying to understand how we can have commits issued to solr while
>> indexing documents. Around 200K to 300K documents per hour with an avg size
>> of 10 KB each will be getting into SOLR. JAVA code fetches the
>> documents from MQ and streams them to SOLR. The problem is that the client code
>> issues a hard commit after each document which is sent to SOLR for indexing,
>> and it waits for the response from SOLR to get assurance that the
>> document got indexed successfully. Only if it gets an OK status from SOLR
>> is the document cleared out from MQ.
>>
>> As far as I understand, doing a commit after each document is an expensive
>> operation. But we need to make sure that all the documents which are put
>> into MQ get indexed in SOLR. Is there any other way of getting this done?
>> Please let me know.
>> If we do batch indexing, is there any chance we can identify if some
>> documents were missed from indexing?
>>
>> Thanks
>> Sangeetha
>>
>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>


-- 


Regards,
Varun Thacker


Re: Commit after every document - alternate approach

2016-03-02 Thread Emir Arnautovic

Hi Sangeetha,
What is sure is that it is not going to work - with 200-300K doc/hour,
there will be >50 commits/second, meaning there is <20ms time for
doc+commit.
What you can do is let Solr handle commits, and maybe use real-time get to
verify a doc is in Solr, or do some periodic sanity checks.
Are you doing document updates, so that in-order Solr updates are the reason
why you commit each doc before moving to the next doc?
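
For illustration, the real-time get check could look like this (a minimal sketch, assuming SolrJ 5+ where getById is available; the id is made up):

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.common.SolrDocument;

    public class RealTimeGetCheck {
        public static boolean isInSolr(SolrClient solr, String id) throws Exception {
            // getById uses the /get (real-time get) handler, which also consults
            // the update log, so even uncommitted documents are visible.
            SolrDocument doc = solr.getById(id);
            return doc != null;
        }
    }

Usage would be RealTimeGetCheck.isInSolr(client, "doc-1") right after the add, with no commit needed first.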


Regards,
Emir

On 02.03.2016 09:06, sangeetha.subraman...@gtnexus.com wrote:

Hi All,

I am trying to understand how we can have commits issued to solr while
indexing documents. Around 200K to 300K documents per hour with an avg size of
10 KB each will be getting into SOLR. JAVA code fetches the documents from
MQ and streams them to SOLR. The problem is that the client code issues a
hard commit after each document which is sent to SOLR for indexing, and it waits
for the response from SOLR to get assurance that the document got indexed
successfully. Only if it gets an OK status from SOLR is the document cleared out
from MQ.

As far as I understand, doing a commit after each document is an expensive
operation. But we need to make sure that all the documents which are put into
MQ get indexed in SOLR. Is there any other way of getting this done? Please
let me know.
If we do batch indexing, is there any chance we can identify if some
documents were missed from indexing?

Thanks
Sangeetha



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Commit Error

2015-10-28 Thread Rallavagu

Thanks Shawn for the response.

Seeing very high CPU during this time and very high warmup times. During
this time, there were plenty of these errors logged. So, I am trying to find
out possible causes for this to occur. Could it be disk I/O issues or
something else, as it is related to commit (writing to disk)?


On 10/28/15 3:57 PM, Shawn Heisey wrote:

On 10/28/2015 2:06 PM, Rallavagu wrote:

Solr 4.6.1, cloud

Seeing following commit errors.

[commitScheduler-19-thread-1] ERROR
org.apache.solr.update.CommitTracker – auto commit
error...:java.lang.IllegalStateException: this writer hit an
OutOfMemoryError; cannot commit at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2807)
at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:559)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216) at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:440) at
java.util.concurrent.FutureTask.run(FutureTask.java:138) at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:682)

Looking at the code,

public final void prepareCommit() throws IOException {
  ensureOpen();
  prepareCommitInternal();
}

private void prepareCommitInternal() throws IOException {
  synchronized(commitLock) {
    ensureOpen(false);
    if (infoStream.isEnabled("IW")) {
      infoStream.message("IW", "prepareCommit: flush");
      infoStream.message("IW", "  index before flush " + segString());
    }

    if (hitOOM) {
      throw new IllegalStateException("this writer hit an OutOfMemoryError; cannot commit");
    }

So it is simply checking a flag to see if it hit OOM? What causes the flag to be
checked and set? What could be the conditions? Thanks.


This exception handling was revamped in Lucene 4.10.1 (and therefore in
Solr 4.10.1) by this issue:

https://issues.apache.org/jira/browse/LUCENE-5958

The "hitOOM" variable was removed by the following specific commit --
this is the commit on the 4.10 branch, but it was also committed to
branch_4x and trunk as well.  Later commits on this same issue were made
to branch_5x -- the cutover to begin the 5.0 release process was made
while this issue was still being fixed.

https://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_10/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java?r1=1626189=1626188=1626189

In the code before this fix, the hitOOM flag is set by other methods in
IndexWriter.  It is volatile to prevent problems with multiple threads
updating and accessing it.

Your message doesn't indicate what problems you're having besides an
error message in your log.  LUCENE-5958 indicates that the problems
could be as bad as a corrupt index.

The reason that IndexWriter swallows OOM exceptions is that this is the
only way Lucene can even *attempt* to avoid index corruption in every
error situation.  Lucene has had a very good track record at avoiding
index corruption, but every now and then a bug is found and a user
manages to get a corrupted index.

Thanks,
Shawn



Re: Commit Error

2015-10-28 Thread Shawn Heisey
On 10/28/2015 2:06 PM, Rallavagu wrote:
> Solr 4.6.1, cloud
>
> Seeing following commit errors.
>
> [commitScheduler-19-thread-1] ERROR
> org.apache.solr.update.CommitTracker – auto commit
> error...:java.lang.IllegalStateException: this writer hit an
> OutOfMemoryError; cannot commit at
> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2807)
> at
> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
> at
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:559)
> at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216) at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:440) at
> java.util.concurrent.FutureTask.run(FutureTask.java:138) at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
> at java.lang.Thread.run(Thread.java:682)
>
> Looking at the code,
>
> public final void prepareCommit() throws IOException {
>   ensureOpen();
>   prepareCommitInternal();
> }
>
> private void prepareCommitInternal() throws IOException {
>   synchronized(commitLock) {
>     ensureOpen(false);
>     if (infoStream.isEnabled("IW")) {
>       infoStream.message("IW", "prepareCommit: flush");
>       infoStream.message("IW", "  index before flush " + segString());
>     }
>
>     if (hitOOM) {
>       throw new IllegalStateException("this writer hit an
> OutOfMemoryError; cannot commit");
>     }
>
> So it is simply checking a flag to see if it hit OOM? What causes the flag
> to be checked and set? What could be the conditions? Thanks.

This exception handling was revamped in Lucene 4.10.1 (and therefore in
Solr 4.10.1) by this issue:

https://issues.apache.org/jira/browse/LUCENE-5958

The "hitOOM" variable was removed by the following specific commit --
this is the commit on the 4.10 branch, but it was also committed to
branch_4x and trunk as well.  Later commits on this same issue were made
to branch_5x -- the cutover to begin the 5.0 release process was made
while this issue was still being fixed.

https://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_10/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java?r1=1626189=1626188=1626189

In the code before this fix, the hitOOM flag is set by other methods in
IndexWriter.  It is volatile to prevent problems with multiple threads
updating and accessing it.

Your message doesn't indicate what problems you're having besides an
error message in your log.  LUCENE-5958 indicates that the problems
could be as bad as a corrupt index.

The reason that IndexWriter swallows OOM exceptions is that this is the
only way Lucene can even *attempt* to avoid index corruption in every
error situation.  Lucene has had a very good track record at avoiding
index corruption, but every now and then a bug is found and a user
manages to get a corrupted index.

Thanks,
Shawn



Re: Commit Error

2015-10-28 Thread Rallavagu
Also, is it this thread that went OOM, and what could cause it? The heap was
doing fine and the server was live and running.


On 10/28/15 3:57 PM, Shawn Heisey wrote:

On 10/28/2015 2:06 PM, Rallavagu wrote:

Solr 4.6.1, cloud

Seeing following commit errors.

[commitScheduler-19-thread-1] ERROR
org.apache.solr.update.CommitTracker – auto commit
error...:java.lang.IllegalStateException: this writer hit an
OutOfMemoryError; cannot commit at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2807)
at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:559)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216) at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:440) at
java.util.concurrent.FutureTask.run(FutureTask.java:138) at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:682)

Looking at the code,

public final void prepareCommit() throws IOException {
  ensureOpen();
  prepareCommitInternal();
}

private void prepareCommitInternal() throws IOException {
  synchronized(commitLock) {
    ensureOpen(false);
    if (infoStream.isEnabled("IW")) {
      infoStream.message("IW", "prepareCommit: flush");
      infoStream.message("IW", "  index before flush " + segString());
    }

    if (hitOOM) {
      throw new IllegalStateException("this writer hit an OutOfMemoryError; cannot commit");
    }

So it is simply checking a flag to see if it hit OOM? What causes the flag to be
checked and set? What could be the conditions? Thanks.


This exception handling was revamped in Lucene 4.10.1 (and therefore in
Solr 4.10.1) by this issue:

https://issues.apache.org/jira/browse/LUCENE-5958

The "hitOOM" variable was removed by the following specific commit --
this is the commit on the 4.10 branch, but it was also committed to
branch_4x and trunk as well.  Later commits on this same issue were made
to branch_5x -- the cutover to begin the 5.0 release process was made
while this issue was still being fixed.

https://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_10/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java?r1=1626189=1626188=1626189

In the code before this fix, the hitOOM flag is set by other methods in
IndexWriter.  It is volatile to prevent problems with multiple threads
updating and accessing it.

Your message doesn't indicate what problems you're having besides an
error message in your log.  LUCENE-5958 indicates that the problems
could be as bad as a corrupt index.

The reason that IndexWriter swallows OOM exceptions is that this is the
only way Lucene can even *attempt* to avoid index corruption in every
error situation.  Lucene has had a very good track record at avoiding
index corruption, but every now and then a bug is found and a user
manages to get a corrupted index.

Thanks,
Shawn



Re: Commit Error

2015-10-28 Thread Shawn Heisey
On 10/28/2015 5:11 PM, Rallavagu wrote:
> Seeing very high CPU during this time and very high warmup times. During
> this time, there were plenty of these errors logged. So, I am trying to find
> out possible causes for this to occur. Could it be disk I/O issues or
> something else, as it is related to commit (writing to disk)?

Lucene is claiming that you're hitting the Out Of Memory exception.  I
pulled down the 4.6.1 source code to verify IndexWriter's behavior.  The
only time hitOOM can be set to true is when OutOfMemoryError is being
thrown, so unless you're running Solr built from modified source code,
Lucene's claim *is* what's happening.

In OOM situations, there's a good chance that Java is going to be
spending a lot of time doing garbage collection, which can cause CPU
usage to go high and make warm times long.

The behavior of most Java programs is completely unpredictable when Java
actually runs out of memory.  As already mentioned, the parts of Lucene
that update the index are specifically programmed to deal with OOM
without causing index corruption.  Writing code that is predictable in
OOM situations is challenging, so only a subset of the code in
Lucene/Solr has been hardened in this way.  Most of it is as
unpredictable in OOM as any other Java program.

Thanks,
Shawn



Re: Commit Error

2015-10-28 Thread Rallavagu



On 10/28/15 5:41 PM, Shawn Heisey wrote:

On 10/28/2015 5:11 PM, Rallavagu wrote:

Seeing very high CPU during this time and very high warmup times. During
this time, there were plenty of these errors logged. So, I am trying to find
out possible causes for this to occur. Could it be disk I/O issues or
something else, as it is related to commit (writing to disk)?


Lucene is claiming that you're hitting the Out Of Memory exception.  I
pulled down the 4.6.1 source code to verify IndexWriter's behavior.  The
only time hitOOM can be set to true is when OutOfMemoryError is being
thrown, so unless you're running Solr built from modified source code,
Lucene's claim *is* what's happening.


This is very likely true, as the source is not modified.



In OOM situations, there's a good chance that Java is going to be
spending a lot of time doing garbage collection, which can cause CPU
usage to go high and make warm times long.


Again, I think this is the likely case. Even though there is no apparent
OOM, the JVM can throw an OOM when an excessive number of full GCs is unable
to reclaim a certain amount of memory.




The behavior of most Java programs is completely unpredictable when Java
actually runs out of memory.  As already mentioned, the parts of Lucene
that update the index are specifically programmed to deal with OOM
without causing index corruption.  Writing code that is predictable in
OOM situations is challenging, so only a subset of the code in
Lucene/Solr has been hardened in this way.  Most of it is as
unpredictable in OOM as any other Java program.


Thanks Shawn.



Thanks,
Shawn



Re: commit of xml update by AJAX

2015-08-30 Thread Upayavira


On Sat, Aug 29, 2015, at 05:30 PM, Szűcs Roland wrote:
 Hello SOLR experts,
 
 I am new to solr as you will see from my problem. I am just trying to
 understand
 how solr works. I use one core (BandW) on my local machine and I use
 javascript for my learning purpose.
 
 I have a test schema.xml with two fields: id, title. I managed to run
 queries with faceting, autocomplete, etc. In all cases I used an Ajax POST
 method. For example my search was (searchWithSuggest.searchAjaxRequest is
 an XMLHttpRequest object):
 var s=document.getElementById(searchWithSuggest.inputBoxId).value;
 var params='q='+s+'&start=0&rows=10';
 a=searchWithSuggest.solrServer+'/query';
 searchWithSuggest.searchAjaxRequest.open("POST", a, true);
 searchWithSuggest.searchAjaxRequest.setRequestHeader("Content-type",
 "application/x-www-form-urlencoded");
 searchWithSuggest.searchAjaxRequest.send(encodeURIComponent(params));
 
 It worked fine. I thought that an xml update can work the same way so I
 tried to add and index one new document by xml (a is an XMLHttpRequest
 object):
 a.open("POST","http://localhost:8983/solr/bandw/update",true);
 a.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
 a.send(encodeURIComponent("stream.body=<add commitWithin='5000'><doc><field
 name='id'>3222</field><field name='title'>Blade</field></doc></add>"));
 
 I got a response with error: missing content stream.
 
 I have changed only the a.open function call to this one:
 a.open("POST","http://localhost:8983/solr/bandw/update?commit=true",true);
 the rest of the code did not change.
 Finally, I got response with no error from SOLR. Later it turned out that
 the new doc was not indexed at all.
 
 My questions:
 1. If I get no error from solr what is wrong with the second solution and
 how can I fix it?
 2. Is there any solution to put all the parameters into the a.send call as
 in the case of queries? I tried
 a.send(encodeURIComponent("commit=true&stream.body=<add
 commitWithin='5000'><doc><field name='id'>3222</field><field
 name='title'>Blade</field></doc></add>")); but it was not working.
 3. Why do 95% of the examples in the SOLR wiki pages relate to curl? Is this
 the most efficient alternative? Is there a mapping between the curl syntax
 and the post request?
 
 Best Regards,
 Roland

You're using a POST to fake a GET - just make the Content-type text/xml
(or application/xml, I forget) and call a.send("<add></add>");

You may need the encodeURIComponent, not sure.

The stream.body feature allows you to do an HTTP GET that has a stream
within it, but you are already doing a POST so it isn't needed.
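
For illustration, the same idea in plain Java (a minimal JDK HttpURLConnection sketch; the core name and document values are taken from the example above):

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class XmlUpdatePost {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8983/solr/bandw/update?commit=true");
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
            con.setDoOutput(true);
            // The XML body is sent as-is: no form encoding, no stream.body wrapper.
            String body = "<add commitWithin='5000'><doc>"
                    + "<field name='id'>3222</field>"
                    + "<field name='title'>Blade</field>"
                    + "</doc></add>";
            try (OutputStream out = con.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }
            System.out.println("HTTP status: " + con.getResponseCode());
        }
    }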

Upayavira


Re: commit of xml update by AJAX

2015-08-30 Thread Szűcs Roland
Thanks Erick,

Your blog post made it clear. It was looong, but not too long.

Roland

2015-08-29 19:00 GMT+02:00 Erick Erickson erickerick...@gmail.com:

 1> My first guess is that your autocommit
 section in solrconfig.xml has <openSearcher>false</openSearcher>.
 So the commitWithin happened but a new searcher
 was not opened, thus the document is invisible.
 Try issuing a separate commit or change that value
 in solrconfig.xml and try again.

 Here's a long post on all this:

 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

 2> No clue since I'm pretty ajax-ignorant.

 3> because curl is easily downloadable at worst and most often
 already on someone's machine, and lets people at least get started.
 Pretty soon, though, for production situations people will use SolrJ
 or the like, or use one of the off-the-shelf tools packaged around
 Solr.

 Best
 Erick

 On Sat, Aug 29, 2015 at 9:30 AM, Szűcs Roland
 szucs.rol...@bookandwalk.hu wrote:
  Hello SOLR experts,
 
  I am new to solr as you will see from my problem. I am just trying to
  understand
  how solr works. I use one core (BandW) on my local machine and I use
  javascript for my learning purpose.
 
  I have a test schema.xml with two fields: id, title. I managed to run
  queries with faceting, autocomplete, etc. In all cases I used an Ajax POST
  method. For example my search was (searchWithSuggest.searchAjaxRequest is
  an XMLHttpRequest object):
  var s=document.getElementById(searchWithSuggest.inputBoxId).value;
  var params='q='+s+'&start=0&rows=10';
  a=searchWithSuggest.solrServer+'/query';
  searchWithSuggest.searchAjaxRequest.open("POST", a, true);
  searchWithSuggest.searchAjaxRequest.setRequestHeader("Content-type",
  "application/x-www-form-urlencoded");
  searchWithSuggest.searchAjaxRequest.send(encodeURIComponent(params));
 
  It worked fine. I thought that an xml update can work the same way so I
  tried to add and index one new document by xml (a is an XMLHttpRequest
  object):
  a.open("POST","http://localhost:8983/solr/bandw/update",true);
  a.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
  a.send(encodeURIComponent("stream.body=<add commitWithin='5000'><doc><field
  name='id'>3222</field><field name='title'>Blade</field></doc></add>"));
 
  I got a response with error: missing content stream.
 
  I have changed only the a.open function call to this one:
  a.open("POST","http://localhost:8983/solr/bandw/update?commit=true",true);
  the rest of the code did not change.
  Finally, I got response with no error from SOLR. Later it turned out that
  the new doc was not indexed at all.
 
  My questions:
  1. If I get no error from solr what is wrong with the second solution and
  how can I fix it?
  2. Is there any solution to put all the parameters into the a.send call as
  in the case of queries? I tried
  a.send(encodeURIComponent("commit=true&stream.body=<add
  commitWithin='5000'><doc><field name='id'>3222</field><field
  name='title'>Blade</field></doc></add>")); but it was not working.
  3. Why do 95% of the examples in the SOLR wiki pages relate to curl? Is this
  the most efficient alternative? Is there a mapping between the curl syntax
  and the post request?
 
  Best Regards,
  Roland
 
  --
  Szűcs Roland
  Managing Director (Ügyvezető), Bookandwalk.hu - https://bookandwalk.hu/
  Phone: +36 1 210 81 13
  Connect with me on LinkedIn: https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu




-- 
Szűcs Roland
Managing Director (Ügyvezető), Bookandwalk.hu - https://bookandwalk.hu/
Phone: +36 1 210 81 13
Connect with me on LinkedIn: https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu


Re: commit of xml update by AJAX

2015-08-30 Thread Szűcs Roland
Hi Upayavira,

You were right. I only had to change the Content-type to application/xml
and it worked correctly.

Roland

2015-08-30 11:22 GMT+02:00 Upayavira u...@odoko.co.uk:



 On Sat, Aug 29, 2015, at 05:30 PM, Szűcs Roland wrote:
  Hello SOLR experts,
 
   I am new to solr as you will see from my problem. I am just trying to
   understand
   how solr works. I use one core (BandW) on my local machine and I use
   javascript for my learning purpose.
  
   I have a test schema.xml with two fields: id, title. I managed to run
   queries with faceting, autocomplete, etc. In all cases I used an Ajax POST
   method. For example my search was (searchWithSuggest.searchAjaxRequest is
   an XMLHttpRequest object):
   var s=document.getElementById(searchWithSuggest.inputBoxId).value;
   var params='q='+s+'&start=0&rows=10';
   a=searchWithSuggest.solrServer+'/query';
   searchWithSuggest.searchAjaxRequest.open("POST", a, true);
   searchWithSuggest.searchAjaxRequest.setRequestHeader("Content-type",
   "application/x-www-form-urlencoded");
   searchWithSuggest.searchAjaxRequest.send(encodeURIComponent(params));
 
   It worked fine. I thought that an xml update can work the same way so I
   tried to add and index one new document by xml (a is an XMLHttpRequest
   object):
   a.open("POST","http://localhost:8983/solr/bandw/update",true);
   a.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
   a.send(encodeURIComponent("stream.body=<add commitWithin='5000'><doc><field
   name='id'>3222</field><field name='title'>Blade</field></doc></add>"));
 
  I got a response with error: missing content stream.
 
   I have changed only the a.open function call to this one:
   a.open("POST","http://localhost:8983/solr/bandw/update?commit=true",true);
   the rest of the code did not change.
  Finally, I got response with no error from SOLR. Later it turned out that
  the new doc was not indexed at all.
 
  My questions:
  1. If I get no error from solr what is wrong with the second solution and
  how can I fix it?
   2. Is there any solution to put all the parameters into the a.send call as
   in the case of queries? I tried
   a.send(encodeURIComponent("commit=true&stream.body=<add
   commitWithin='5000'><doc><field name='id'>3222</field><field
   name='title'>Blade</field></doc></add>")); but it was not working.
   3. Why do 95% of the examples in the SOLR wiki pages relate to curl? Is this
   the most efficient alternative? Is there a mapping between the curl syntax
   and the post request?
 
  Best Regards,
  Roland

 You're using a POST to fake a GET - just make the Content-type text/xml
  (or application/xml, I forget) and call a.send("<add></add>");

 You may need the encodeURIComponent, not sure.

 The stream.body feature allows you to do an HTTP GET that has a stream
 within it, but you are already doing a POST so it isn't needed.

 Upayavira




-- 
Szűcs Roland
Managing Director (Ügyvezető), Bookandwalk.hu - https://bookandwalk.hu/
Phone: +36 1 210 81 13
Connect with me on LinkedIn: https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu


Re: commit on a spec shard with SolrCloud

2014-05-21 Thread Alexandre Rafalovitch
You can probably do a custom update request processor chain and skip
the distributed component. No idea of the consequences though.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Thu, May 22, 2014 at 10:42 AM, YouPeng Yang
yypvsxf19870...@gmail.com wrote:
 Hi
   I am doing DIH to one of the shards in my SolrCloud collection. I notice that
 every time I do a commit on that shard, all the other shards commit too.
   I have checked the source code of DistributedUpdateProcessor.processCommit; it
 says that processCommit extends the commit to all the shards in the collection.
   What I want to achieve is to do DIH, commit, and update (including delete) on
 the respective shard only.
   Referring to DistributedUpdateProcessor.java: for DIH, I need to add
 shards=myshard to the request. For updates, the _route_ or shard.keys parameter
 will be used to achieve this. By the way, I have filed a JIRA about making the
 action consistent with the shards parameter instead of _route_ or shard.keys
 when doing updates (including deletes). However, for commit, there is no
 parameter that can be used to impose a commit only on a specific shard.
   So, I am thinking of making some changes to
 DistributedUpdateProcessor.processCommit to commit only to a specific shard.
   Are there any suggestions about what I am thinking? Or are there any side
 effects that come with the change?


Re: commit persistence guarantee

2014-05-16 Thread Erick Erickson
This is almost always a sign that you're committing too often, either soft
commit or hard commit with openSearcher=true. It shouldn't have any
effect on the consistency of your index though.

It _is_ making your Solr work harder than you want it to, so consider
increasing the commit intervals substantially. If you're indexing from
SolrJ, it's _not_ a good idea to commit except, perhaps, at the very
end of the run. Let your solrconfig settings commit for you. Super
especially if you're indexing from multiple SolrJ programs.

Best,
Erick

On Wed, May 7, 2014 at 3:02 AM, Alvaro Cabrerizo topor...@gmail.com wrote:
 Hi,

 Is there any guarantee that every document is persisted on disk during a
 commit avalanche that produces the: ERROR org.apache.solr.core.SolrCore
  – org.apache.solr.common.SolrException: Error opening new searcher. *exceeded
 limit of maxWarmingSearchers*=1, try again later.

 I've made some tests using jmeter to generate the situation and I *always*
 get all the documents *well stored*, although having ~4% of requests with a
 503 response, complaining with the previous message in the log.

 Regards.

 notes:  I know about NearRealTime and the possibility of modifying the
 commit strategy in order to be more polite with Solr ;)


Re: Commit Within and /update/extract handler

2014-04-09 Thread Jamie Johnson
This is being triggered by adding the commitWithin param to
ContentStreamUpdateRequest (request.setCommitWithin(1);).  My
configuration has an autoCommit max time of 15s and openSearcher set to false.
I'm assuming that changing openSearcher to true should address this, and that
adding softCommit=true to the request would make the documents
available in the meantime?
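
For reference, a minimal SolrJ sketch of that kind of extract request (the file name, literal and commitWithin value are made up; class names are the newer SolrJ ones):

    import java.io.File;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    public class ExtractUpload {
        public static void upload(SolrClient solr) throws Exception {
            ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
            req.addFile(new File("exclusions.bin"), "application/octet-stream");
            req.setParam("literal.id", "doc-1");
            req.setCommitWithin(10000); // ask Solr to commit within 10s of the add
            req.process(solr);
        }
    }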

On Apr 8, 2014 10:02 AM, Erick Erickson erickerick...@gmail.com wrote:

 Got a clue how it's being generated? Because it's not going to show
 you documents.


 commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

 openSearcher=false and softCommit=false so the documents will be
 invisible. You need one or the other set to true.

 What it will do is close the current segment, open a new one and
 truncate the current transaction log. These may be good things but
 they have nothing to do with making docs visible :).

 See:

 http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

 Best,
 Erick

 On Mon, Apr 7, 2014 at 8:43 PM, Jamie Johnson jej2...@gmail.com wrote:
  Below is the log showing what I believe to be the commit
 
  07-Apr-2014 23:40:55.846 INFO [catalina-exec-5]
  org.apache.solr.update.processor.LogUpdateProcessor.finish [forums]
  webapp=/solr path=/update/extract
 
 params={uprefix=attr_&literal.source_id=e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce&literal.content_group=File&literal.id=e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce&literal.forum_id=3&literal.content_type=application/octet-stream&wt=javabin&literal.uploaded_by=+&version=2&literal.content_type=application/octet-stream&literal.file_name=exclusions}
  {add=[e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce (1464785652471037952)]} 0 563
  07-Apr-2014 23:41:10.847 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.DirectUpdateHandler2.commit start
 
 commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
  07-Apr-2014 23:41:10.847 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [IW][commitScheduler-10-thread-1]: commit: start
  07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [IW][commitScheduler-10-thread-1]: commit: enter lock
  07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [IW][commitScheduler-10-thread-1]: commit: now prepare
  07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [IW][commitScheduler-10-thread-1]: prepareCommit: flush
  07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [IW][commitScheduler-10-thread-1]:   index before flush _y(4.6):C1
  _10(4.6):C1 _11(4.6):C1 _12(4.6):C1
  07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DW][commitScheduler-10-thread-1]: commitScheduler-10-thread-1
  startFullFlush
  07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DW][commitScheduler-10-thread-1]: anyChanges? numDocsInRam=1
 deletes=true
  hasTickets:false pendingChangesInFullFlush: false
  07-Apr-2014 23:41:10.850 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DWFC][commitScheduler-10-thread-1]: addFlushableState
  DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_14,
  aborting=false, numDocsInRAM=1, deleteQueue=DWDQ: [ generation: 2 ]]
  07-Apr-2014 23:41:10.852 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DWPT][commitScheduler-10-thread-1]: flush postings as segment _14
 numDocs=1
  07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DWPT][commitScheduler-10-thread-1]: new segment has 0 deleted docs
  07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DWPT][commitScheduler-10-thread-1]: new segment has no vectors; norms;
 no
  docValues; prox; freqs
  07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DWPT][commitScheduler-10-thread-1]: flushedFiles=[_14.nvd,
  _14_Lucene41_0.pos, _14_Lucene41_0.tip, _14_Lucene41_0.tim, _14.nvm,
  _14.fdx, _14_Lucene41_0.doc, _14.fnm, _14.fdt]
  07-Apr-2014 23:41:10.905 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DWPT][commitScheduler-10-thread-1]: flushed codec=Lucene46
  07-Apr-2014 23:41:10.905 INFO [commitScheduler-10-thread-1]
  org.apache.solr.update.LoggingInfoStream.message
  [DWPT][commitScheduler-10-thread-1]: flushed: segment=_14 ramUsed=0.122
 MB
  newFlushedSize(includes docstores)=0.003 MB docs/MB=322.937
  07-Apr-2014 

Re: Commit Within and /update/extract handler

2014-04-09 Thread Shawn Heisey
On 4/9/2014 7:47 AM, Jamie Johnson wrote:
 This is being triggered by adding the commitWithin param to
 ContentStreamUpdateRequest (request.setCommitWithin(1);).  My
 configuration has autoCommit max time of 15s and openSearcher set to false.
  I'm assuming that changing openSearcher to true should address this, and
 adding the softCommit = true to the request would make the documents
 available in the mean time?

My personal opinion: autoCommit should not be used for document
visibility, even though it CAN be used for it.  It belongs in every
config that uses the transaction log, with openSearcher set to false,
and carefully considered maxTime and/or maxDocs parameters.

I think it's better to control document visibility entirely manually,
but if you actually do want to have an automatic commit for document
visibility, use autoSoftCommit.  It doesn't make any sense to disable
openSearcher on a soft commit, so just leave that out.  The docs/time
intervals for this can be smaller or greater than the intervals for
autoCommit, depending on your needs.

Any manual commits that you send probably should be soft commits, but
honestly that doesn't really matter if your auto settings are correct.
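
For illustration, a manual soft commit from SolrJ is the three-argument commit (a minimal sketch; the argument order is waitFlush, waitSearcher, softCommit):

    import org.apache.solr.client.solrj.SolrClient;

    public class SoftCommitExample {
        public static void makeVisible(SolrClient solr) throws Exception {
            // waitFlush=true, waitSearcher=true, softCommit=true:
            // opens a new searcher for visibility without flushing segments to disk.
            solr.commit(true, true, true);
        }
    }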

Thanks,
Shawn



Re: Commit Within and /update/extract handler

2014-04-09 Thread Jamie Johnson
Thanks Shawn, I appreciate the information.


On Wed, Apr 9, 2014 at 10:27 AM, Shawn Heisey s...@elyograg.org wrote:

 On 4/9/2014 7:47 AM, Jamie Johnson wrote:
  This is being triggered by adding the commitWithin param to
  ContentStreamUpdateRequest (request.setCommitWithin(1);).  My
  configuration has autoCommit max time of 15s and openSearcher set to
 false.
   I'm assuming that changing openSearcher to true should address this, and
  adding the softCommit = true to the request would make the documents
  available in the mean time?

 My personal opinion: autoCommit should not be used for document
 visibility, even though it CAN be used for it.  It belongs in every
 config that uses the transaction log, with openSearcher set to false,
 and carefully considered maxTime and/or maxDocs parameters.

 I think it's better to control document visibility entirely manually,
 but if you actually do want to have an automatic commit for document
 visibility, use autoSoftCommit.  It doesn't make any sense to disable
 openSearcher on a soft commit, so just leave that out.  The docs/time
 intervals for this can be smaller or greater than the intervals for
 autoCommit, depending on your needs.

 Any manual commits that you send probably should be soft commits, but
 honestly that doesn't really matter if your auto settings are correct.

 Thanks,
 Shawn




Re: Commit Within and /update/extract handler

2014-04-08 Thread Erick Erickson
Got a clue how it's being generated? Because it's not going to show
you documents.

commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

openSearcher=false and softCommit=false so the documents will be
invisible. You need one or the other set to true.

What it will do is close the current segment, open a new one and
truncate the current transaction log. These may be good things but
they have nothing to do with making docs visible :).

See:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Mon, Apr 7, 2014 at 8:43 PM, Jamie Johnson jej2...@gmail.com wrote:
 Below is the log showing what I believe to be the commit

 07-Apr-2014 23:40:55.846 INFO [catalina-exec-5]
 org.apache.solr.update.processor.LogUpdateProcessor.finish [forums]
 webapp=/solr path=/update/extract
 params={uprefix=attr_&literal.source_id=e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce&literal.content_group=File&literal.id=e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce&literal.forum_id=3&literal.content_type=application/octet-stream&wt=javabin&literal.uploaded_by=+&version=2&literal.content_type=application/octet-stream&literal.file_name=exclusions}
 {add=[e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce (1464785652471037952)]} 0 563
 07-Apr-2014 23:41:10.847 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.DirectUpdateHandler2.commit start
 commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
 07-Apr-2014 23:41:10.847 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [IW][commitScheduler-10-thread-1]: commit: start
 07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [IW][commitScheduler-10-thread-1]: commit: enter lock
 07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [IW][commitScheduler-10-thread-1]: commit: now prepare
 07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [IW][commitScheduler-10-thread-1]: prepareCommit: flush
 07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [IW][commitScheduler-10-thread-1]:   index before flush _y(4.6):C1
 _10(4.6):C1 _11(4.6):C1 _12(4.6):C1
 07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DW][commitScheduler-10-thread-1]: commitScheduler-10-thread-1
 startFullFlush
 07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DW][commitScheduler-10-thread-1]: anyChanges? numDocsInRam=1 deletes=true
 hasTickets:false pendingChangesInFullFlush: false
 07-Apr-2014 23:41:10.850 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DWFC][commitScheduler-10-thread-1]: addFlushableState
 DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_14,
 aborting=false, numDocsInRAM=1, deleteQueue=DWDQ: [ generation: 2 ]]
 07-Apr-2014 23:41:10.852 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DWPT][commitScheduler-10-thread-1]: flush postings as segment _14 numDocs=1
 07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DWPT][commitScheduler-10-thread-1]: new segment has 0 deleted docs
 07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DWPT][commitScheduler-10-thread-1]: new segment has no vectors; norms; no
 docValues; prox; freqs
 07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DWPT][commitScheduler-10-thread-1]: flushedFiles=[_14.nvd,
 _14_Lucene41_0.pos, _14_Lucene41_0.tip, _14_Lucene41_0.tim, _14.nvm,
 _14.fdx, _14_Lucene41_0.doc, _14.fnm, _14.fdt]
 07-Apr-2014 23:41:10.905 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DWPT][commitScheduler-10-thread-1]: flushed codec=Lucene46
 07-Apr-2014 23:41:10.905 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DWPT][commitScheduler-10-thread-1]: flushed: segment=_14 ramUsed=0.122 MB
 newFlushedSize(includes docstores)=0.003 MB docs/MB=322.937
 07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [DW][commitScheduler-10-thread-1]: publishFlushedSegment seg-private
 updates=null
 07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [IW][commitScheduler-10-thread-1]: publishFlushedSegment
 07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1]
 org.apache.solr.update.LoggingInfoStream.message
 [BD][commitScheduler-10-thread-1]: push deletes  1 deleted terms (unique
 

Re: Commit Within and /update/extract handler

2014-04-07 Thread Erick Erickson
You say you see the commit happen in the log, is openSearcher
specified? This sounds like you're somehow getting a commit
with openSearcher=false...

Best,
Erick

On Sun, Apr 6, 2014 at 5:37 PM, Jamie Johnson jej2...@gmail.com wrote:
 I'm running solr 4.6.0 and am noticing that commitWithin doesn't seem to
 work when I am using the /update/extract request handler.  It looks like a
 commit is happening from the logs, but the documents don't become available
 for search until I do a commit manually.  Could this be some type of
 configuration issue?


Re: Commit Within and /update/extract handler

2014-04-07 Thread Erick Erickson
What does the call look like? Are you setting it to open a new searcher
or not? That should be in the log line where the commit is recorded...

FWIW,
Erick

On Sun, Apr 6, 2014 at 5:37 PM, Jamie Johnson jej2...@gmail.com wrote:
 I'm running solr 4.6.0 and am noticing that commitWithin doesn't seem to
 work when I am using the /update/extract request handler.  It looks like a
 commit is happening from the logs, but the documents don't become available
 for search until I do a commit manually.  Could this be some type of
 configuration issue?


Re: Commit Within and /update/extract handler

2014-04-07 Thread Jamie Johnson
Below is the log showing what I believe to be the commit

07-Apr-2014 23:40:55.846 INFO [catalina-exec-5]
org.apache.solr.update.processor.LogUpdateProcessor.finish [forums]
webapp=/solr path=/update/extract
params={uprefix=attr_&literal.source_id=e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce&literal.content_group=File&literal.id=e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce&literal.forum_id=3&literal.content_type=application/octet-stream&wt=javabin&literal.uploaded_by=+&version=2&literal.content_type=application/octet-stream&literal.file_name=exclusions}
{add=[e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce (1464785652471037952)]} 0 563
07-Apr-2014 23:41:10.847 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.DirectUpdateHandler2.commit start
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
07-Apr-2014 23:41:10.847 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IW][commitScheduler-10-thread-1]: commit: start
07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IW][commitScheduler-10-thread-1]: commit: enter lock
07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IW][commitScheduler-10-thread-1]: commit: now prepare
07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IW][commitScheduler-10-thread-1]: prepareCommit: flush
07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IW][commitScheduler-10-thread-1]:   index before flush _y(4.6):C1
_10(4.6):C1 _11(4.6):C1 _12(4.6):C1
07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DW][commitScheduler-10-thread-1]: commitScheduler-10-thread-1
startFullFlush
07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DW][commitScheduler-10-thread-1]: anyChanges? numDocsInRam=1 deletes=true
hasTickets:false pendingChangesInFullFlush: false
07-Apr-2014 23:41:10.850 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DWFC][commitScheduler-10-thread-1]: addFlushableState
DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_14,
aborting=false, numDocsInRAM=1, deleteQueue=DWDQ: [ generation: 2 ]]
07-Apr-2014 23:41:10.852 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DWPT][commitScheduler-10-thread-1]: flush postings as segment _14 numDocs=1
07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DWPT][commitScheduler-10-thread-1]: new segment has 0 deleted docs
07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DWPT][commitScheduler-10-thread-1]: new segment has no vectors; norms; no
docValues; prox; freqs
07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DWPT][commitScheduler-10-thread-1]: flushedFiles=[_14.nvd,
_14_Lucene41_0.pos, _14_Lucene41_0.tip, _14_Lucene41_0.tim, _14.nvm,
_14.fdx, _14_Lucene41_0.doc, _14.fnm, _14.fdt]
07-Apr-2014 23:41:10.905 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DWPT][commitScheduler-10-thread-1]: flushed codec=Lucene46
07-Apr-2014 23:41:10.905 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DWPT][commitScheduler-10-thread-1]: flushed: segment=_14 ramUsed=0.122 MB
newFlushedSize(includes docstores)=0.003 MB docs/MB=322.937
07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[DW][commitScheduler-10-thread-1]: publishFlushedSegment seg-private
updates=null
07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IW][commitScheduler-10-thread-1]: publishFlushedSegment
07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[BD][commitScheduler-10-thread-1]: push deletes  1 deleted terms (unique
count=1) bytesUsed=1024 delGen=4 packetCount=1 totBytesUsed=1024
07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IW][commitScheduler-10-thread-1]: publish sets newSegment delGen=5
seg=_14(4.6):C1
07-Apr-2014 23:41:10.908 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IFD][commitScheduler-10-thread-1]: now checkpoint _y(4.6):C1 _10(4.6):C1
_11(4.6):C1 _12(4.6):C1 _14(4.6):C1 [5 segments ; isCommit = false]
07-Apr-2014 23:41:10.908 INFO [commitScheduler-10-thread-1]
org.apache.solr.update.LoggingInfoStream.message
[IFD][commitScheduler-10-thread-1]: 0 msec to checkpoint
07-Apr-2014 23:41:10.908 INFO [commitScheduler-10-thread-1]

Re: commit=false in Solr update URL

2014-03-29 Thread Erick Erickson
This might be useful:

http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Fri, Mar 28, 2014 at 3:55 PM, Joshi, Shital shital.jo...@gs.com wrote:
 Thank you!

 -Original Message-
 From: Shawn Heisey [mailto:s...@elyograg.org]
 Sent: Friday, March 28, 2014 3:14 PM
 To: solr-user@lucene.apache.org
 Subject: Re: commit=false in Solr update URL

 On 3/28/2014 1:02 PM, Joshi, Shital wrote:
 You mean default for openSearcher is false right? So unless I specify 
 commit=false&openSearcher=true in my Solr Update URL the current searcher 
 and caches will not get invalidated.

 If commit=false, openSearcher does not matter -- it's part of a commit.

 When you actually *do* a commit, openSearcher defaults to true.  You
 have to set it to false if you don't want it to open a new searcher.

 Thanks,
 Shawn



Re: commit=false in Solr update URL

2014-03-28 Thread Shawn Heisey

On 3/28/2014 10:22 AM, Joshi, Shital wrote:

What happens when we use commit=false in Solr update URL?
http://$solr_url/solr/$solr_core/update/csv?commit=false&separator=|&trim=true&skipLines=2&_shard_=$shardid


1.  Does it invalidate all caches? We really need to know this.

2.  Nothing happens to existing searcher, correct?

3.  All data gets written to translog, correct?


1) No.  A commit with openSearcher=true is required to invalidate caches.
1a) The default for openSearcher is true.

2) Correct.  See #1.

3) If the transaction log is enabled, everything gets written to it, 
regardless of commit parameters and auto settings.  A hard commit (with 
any openSearcher value) will close the current transaction log and start 
a new one.


It's my understanding that the default setting for the commit parameter 
is false unless you change it with the config or a request parameter.


Thanks,
Shawn



RE: commit=false in Solr update URL

2014-03-28 Thread Joshi, Shital
Thanks. 

You mean default for openSearcher is false right? So unless I specify 
commit=false&openSearcher=true in my Solr Update URL the current searcher and 
caches will not get invalidated. 

-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Friday, March 28, 2014 12:48 PM
To: solr-user@lucene.apache.org
Subject: Re: commit=false in Solr update URL

On 3/28/2014 10:22 AM, Joshi, Shital wrote:
 What happens when we use commit=false in Solr update URL?
 http://$solr_url/solr/$solr_core/update/csv?commit=false&separator=|&trim=true&skipLines=2&_shard_=$shardid


 1.  Does it invalidate all caches? We really need to know this.

 2.  Nothing happens to existing searcher, correct?

 3.  All data gets written to translog, correct?

1) No.  A commit with openSearcher=true is required to invalidate caches.
1a) The default for openSearcher is true.

2) Correct.  See #1.

3) If the transaction log is enabled, everything gets written to it, 
regardless of commit parameters and auto settings.  A hard commit (with 
any openSearcher value) will close the current transaction log and start 
a new one.

It's my understanding that the default setting for the commit parameter 
is false unless you change it with the config or a request parameter.

Thanks,
Shawn



Re: commit=false in Solr update URL

2014-03-28 Thread Shawn Heisey

On 3/28/2014 1:02 PM, Joshi, Shital wrote:

You mean default for openSearcher is false right? So unless I specify 
commit=false&openSearcher=true in my Solr Update URL the current searcher and 
caches will not get invalidated.


If commit=false, openSearcher does not matter -- it's part of a commit.

When you actually *do* a commit, openSearcher defaults to true.  You 
have to set it to false if you don't want it to open a new searcher.


Thanks,
Shawn



RE: commit=false in Solr update URL

2014-03-28 Thread Joshi, Shital
Thank you!

-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Friday, March 28, 2014 3:14 PM
To: solr-user@lucene.apache.org
Subject: Re: commit=false in Solr update URL

On 3/28/2014 1:02 PM, Joshi, Shital wrote:
 You mean default for openSearcher is false right? So unless I specify 
 commit=false&openSearcher=true in my Solr Update URL the current searcher and 
 caches will not get invalidated.

If commit=false, openSearcher does not matter -- it's part of a commit.

When you actually *do* a commit, openSearcher defaults to true.  You 
have to set it to false if you don't want it to open a new searcher.

Thanks,
Shawn



Re: Commit Issue in Solr 3.4

2014-02-08 Thread samarth s
Yes it is amazon ec2 indeed.

To expand on that,
This solr deployment was working fine, handling the same load, on a 34 GB
instance on ebs storage for quite some time. To reduce the time taken by a
commit, I shifted this to a 30 GB SSD instance. It performed better in
writes and commits for sure. But, since the last week I started facing this
problem of infinite back to back commits. Not being able to resolve this, I
have finally switched back to a 34 GB machine with ebs storage, and now the
commits are working fine, though slow.

Any thoughts?
On 6 Feb 2014 23:00, Shawn Heisey s...@elyograg.org wrote:

 On 2/6/2014 9:56 AM, samarth s wrote:
  Size of index = 260 GB
  Total Docs = 100mn
  Usual writing speed = 50K per hour
  autoCommit-maxDocs = 400,000
  autoCommit-maxTime = 1500,000 (25 mins)
  merge factor = 10
 
  M/c memory = 30 GB, Xmx = 20 GB
  Server - Jetty
  OS - Cent OS 6

 With 30GB of RAM (is it Amazon EC2, by chance?) and a 20GB heap, you
 have about 10GB of RAM left for caching your Solr index.  If that server
 has all 260GB of index, I am really surprised that you have only been
 having problems for a short time.  I would have expected problems from
 day one.  Even if it only has half or one quarter of the index, there is
 still a major discrepancy in RAM vs. index size.

 You either need more memory or you need to reduce the size of your
 index.  The size of the indexed portion generally has more of an impact
 on performance than the size of the stored portion, but they do both
 have an impact, especially on indexing and committing.  With regular
 disks, it's best to have at least 50% of your index size available to
 the OS disk cache, but 100% is better.

 http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache

 If you are already using SSD, you might think there can't be
 memory-related performance problems ... but you still need a pretty
 significant chunk of disk cache.

 https://wiki.apache.org/solr/SolrPerformanceProblems#SSD

 Thanks,
 Shawn




Re: Commit Issue in Solr 3.4

2014-02-08 Thread Shawn Heisey
On 2/8/2014 1:40 AM, samarth s wrote:
 Yes it is amazon ec2 indeed.
 
 To expand on that,
 This solr deployment was working fine, handling the same load, on a 34 GB
 instance on ebs storage for quite some time. To reduce the time taken by a
 commit, I shifted this to a 30 GB SSD instance. It performed better in
 writes and commits for sure. But, since the last week I started facing this
 problem of infinite back to back commits. Not being able to resolve this, I
 have finally switched back to a 34 GB machine with ebs storage, and now the
 commits are working fine, though slow.

The extra 4GB of RAM is almost guaranteed to be the difference.  If your
index continues to grow, you'll probably be having problems very soon
even with 34GB of RAM.  If you could put it on a box with 128 to 256GB
of RAM, you'd likely see your performance increase dramatically.

Can you share your solrconfig.xml file?  I may be able to confirm a
couple of things I suspect, and depending on what's there, may be able
to offer some ideas to help a little bit.  It's best if you use a file
sharing site like dropbox - the list doesn't deal with attachments very
well.  Sometimes they work, but most of the time they don't.

I will reiterate my main point -- you really need a LOT more memory.
Another option is to shard your index across multiple servers.  This
doesn't actually reduce the TOTAL memory requirement, but it is
sometimes easier to get management to agree to buy more servers than it
is to get them to agree to buy really large servers.  It's a paradox
that doesn't make any sense to me, but I've seen it over and over.

Thanks,
Shawn



Re: Commit Issue in Solr 3.4

2014-02-08 Thread Shawn Heisey
On 2/8/2014 10:22 AM, Shawn Heisey wrote:
 Can you share your solrconfig.xml file?  I may be able to confirm a
 couple of things I suspect, and depending on what's there, may be able
 to offer some ideas to help a little bit.  It's best if you use a file
 sharing site like dropbox - the list doesn't deal with attachments very
 well.  Sometimes they work, but most of the time they don't.

One additional idea:  Unless you know through actual testing that you
really do need a 20GB heap, try reducing it.  You have 100 million
documents, so perhaps you really do need a heap that big.

In addition to the solrconfig.xml, I would also be interested in knowing
what memory/GC tuning options you've used to start your java instance,
and I'd like to see a sampling of typical and worst-case query
parameters or URL strings.  I'd need to see all parameters and know
which request handler you used, so I can cross-reference with the config.

Thanks,
Shawn



Re: Commit Issue in Solr 3.4

2014-02-08 Thread Roman Chyla
I would be curious what the cause is. Samarth says that it worked for over
a year /and supposedly docs were being added all the time/. Did the index
grew considerably in the last period? Perhaps he could attach visualvm
while it is in the 'black hole' state to see what is actually going on. I
don't know if the instance is used also for searching, but if its only
indexing, maybe just shorter commit intervals would alleviate the problem.
To add context, our indexer is configured with 16gb heap, on machine with
64gb ram, but busy one, so sometimes there is no cache to spare for os. The
index is 300gb (out of which 140gb stored values), and it is working just
'fine' - 30doc/s on average, but our docs are large /0.5mb on avg/ and
fetched from two databases, so the slowness is outside solr. I didn't see
big improvements with bigger heap, but I don't remember exact numbers. This
is solr4.

Roman
On 8 Feb 2014 12:23, Shawn Heisey s...@elyograg.org wrote:

 On 2/8/2014 1:40 AM, samarth s wrote:
  Yes it is amazon ec2 indeed.
 
  To expqnd on that,
  This solr deployment was working fine, handling the same load, on a 34 GB
  instance on ebs storage for quite some time. To reduce the time taken by
 a
  commit, I shifted this to a 30 GB SSD instance. It performed better in
  writes and commits for sure. But, since the last week I started facing
 this
  problem of infinite back to back commits. Not being able to resolve
 this, I
  have finally switched back to a 34 GB machine with ebs storage, and now
 the
  commits are working fine, though slow.

 The extra 4GB of RAM is almost guaranteed to be the difference.  If your
 index continues to grow, you'll probably be having problems very soon
 even with 34GB of RAM.  If you could put it on a box with 128 to 256GB
 of RAM, you'd likely see your performance increase dramatically.

 Can you share your solrconfig.xml file?  I may be able to confirm a
 couple of things I suspect, and depending on what's there, may be able
 to offer some ideas to help a little bit.  It's best if you use a file
 sharing site like dropbox - the list doesn't deal with attachments very
 well.  Sometimes they work, but most of the time they don't.

 I will reiterate my main point -- you really need a LOT more memory.
 Another option is to shard your index across multiple servers.  This
 doesn't actually reduce the TOTAL memory requirement, but it is
 sometimes easier to get management to agree to buy more servers than it
 is to get them to agree to buy really large servers.  It's a paradox
 that doesn't make any sense to me, but I've seen it over and over.

 Thanks,
 Shawn




Re: Commit Issue in Solr 3.4

2014-02-08 Thread Shawn Heisey
On 2/8/2014 11:02 AM, Roman Chyla wrote:
 I would be curious what the cause is. Samarth says that it worked for over
 a year /and supposedly docs were being added all the time/. Did the index
 grow considerably in the last period? Perhaps he could attach visualvm
 while it is in the 'black hole' state to see what is actually going on. I
 don't know if the instance is used also for searching, but if its only
 indexing, maybe just shorter commit intervals would alleviate the problem.
 To add context, our indexer is configured with 16gb heap, on machine with
 64gb ram, but busy one, so sometimes there is no cache to spare for os. The
 index is 300gb (out of which 140gb stored values), and it is working just
 'fine' - 30doc/s on average, but our docs are large /0.5mb on avg/ and
 fetched from two databases, so the slowness is outside solr. I didn't see
 big improvements with bigger heap, but I don't remember exact numbers. This
 is solr4.

For this discussion, refer to this image, or the Google Books link where
I originally found it:

https://dl.dropboxusercontent.com/u/97770508/performance-dropoff-graph.png

http://books.google.com/books?id=dUiNGYCiWg0Cpg=PA33#v=onepageqf=false

Computer systems have had a long history of performance curves like
this.  Everything goes really well, possibly for a really long time,
until you cross some threshold where a resource cannot keep up with the
demands being placed on it.  That threshold is usually something you
can't calculate in advance.  Once it is crossed, even by a tiny amount,
performance drops VERY quickly.

I do recommend that people closely analyze their GC characteristics, but
jconsole, jvisualvm, and other tools like that are actually not very
good at this task.  You can only get summary info -- how many GCs
occurred and total amount of time spent doing GC, often with a useless
granularity -- jconsole reports the time in minutes on a system that has
been running for any length of time.

I *was* having occasional super-long GC pauses (15 seconds or more), but
I did not know it, even though I had religiously looked at GC info in
jconsole and jstat.  I discovered the problem indirectly, and had to
find additional tools to quantify it.  After discovering it, I tuned my
garbage collection and have not had the problem since.

If you have detailed GC logs enabled, this is a good free tool for
offline analysis:

https://code.google.com/p/gclogviewer/
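
Detailed GC logging is typically enabled with JVM startup flags along these
lines (illustrative; the exact flags vary by JVM version):

  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/solr/gc.log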

I have also had good results with this free tool, but it requires a
little more work to set up:

http://www.azulsystems.com/jHiccup

Azul Systems has an alternate Java implementation for Linux that
virtually eliminates GC pauses, but it isn't free.  I do not have any
information about how much it costs.  We found our own solution, but for
those who can throw money at the problem, I've heard good things about it.

Thanks,
Shawn



Re: Commit Issue in Solr 3.4

2014-02-08 Thread Roman Chyla
Thanks for the links. I think it would be worth getting more detailed info.
Because it could be the performance threshold, or it could be something else
/such as an updated java version or something else loosely related to ram, eg
what is held in memory before the commit, what is cached, leaked custom query
objects holding on to some big object etc/. Btw if I study the graph, I see
that there *are* warning signs. That's the point of testing/measuring after
all, IMHO.

--roman
On 8 Feb 2014 13:51, Shawn Heisey s...@elyograg.org wrote:

 On 2/8/2014 11:02 AM, Roman Chyla wrote:
  I would be curious what the cause is. Samarth says that it worked for
 over
  a year /and supposedly docs were being added all the time/. Did the index
  grow considerably in the last period? Perhaps he could attach visualvm
  while it is in the 'black hole' state to see what is actually going on. I
  don't know if the instance is used also for searching, but if its only
  indexing, maybe just shorter commit intervals would alleviate the
 problem.
  To add context, our indexer is configured with 16gb heap, on machine with
  64gb ram, but busy one, so sometimes there is no cache to spare for os.
 The
  index is 300gb (out of which 140gb stored values), and it is working just
  'fine' - 30doc/s on average, but our docs are large /0.5mb on avg/ and
  fetched from two databases, so the slowness is outside solr. I didn't see
  big improvements with bigger heap, but I don't remember exact numbers.
 This
  is solr4.

 For this discussion, refer to this image, or the Google Books link where
 I originally found it:

 https://dl.dropboxusercontent.com/u/97770508/performance-dropoff-graph.png

 http://books.google.com/books?id=dUiNGYCiWg0Cpg=PA33#v=onepageqf=false

 Computer systems have had a long history of performance curves like
 this.  Everything goes really well, possibly for a really long time,
 until you cross some threshold where a resource cannot keep up with the
 demands being placed on it.  That threshold is usually something you
 can't calculate in advance.  Once it is crossed, even by a tiny amount,
 performance drops VERY quickly.

 I do recommend that people closely analyze their GC characteristics, but
 jconsole, jvisualvm, and other tools like that are actually not very
 good at this task.  You can only get summary info -- how many GCs
 occurred and total amount of time spent doing GC, often with a useless
 granularity -- jconsole reports the time in minutes on a system that has
 been running for any length of time.

 I *was* having occasional super-long GC pauses (15 seconds or more), but
 I did not know it, even though I had religiously looked at GC info in
 jconsole and jstat.  I discovered the problem indirectly, and had to
 find additional tools to quantify it.  After discovering it, I tuned my
 garbage collection and have not had the problem since.

 If you have detailed GC logs enabled, this is a good free tool for
 offline analysis:

 https://code.google.com/p/gclogviewer/

 I have also had good results with this free tool, but it requires a
 little more work to set up:

 http://www.azulsystems.com/jHiccup

 Azul Systems has an alternate Java implementation for Linux that
 virtually eliminates GC pauses, but it isn't free.  I do not have any
 information about how much it costs.  We found our own solution, but for
 those who can throw money at the problem, I've heard good things about it.

 Thanks,
 Shawn




Re: Commit Issue in Solr 3.4

2014-02-06 Thread Shawn Heisey
On 2/6/2014 9:56 AM, samarth s wrote:
 Size of index = 260 GB
 Total Docs = 100mn
 Usual writing speed = 50K per hour
 autoCommit-maxDocs = 400,000
 autoCommit-maxTime = 1,500,000 ms (25 mins)
 merge factor = 10
 
 M/c memory = 30 GB, Xmx = 20 GB
 Server - Jetty
 OS - Cent OS 6

With 30GB of RAM (is it Amazon EC2, by chance?) and a 20GB heap, you
have about 10GB of RAM left for caching your Solr index.  If that server
has all 260GB of index, I am really surprised that you have only been
having problems for a short time.  I would have expected problems from
day one.  Even if it only has half or one quarter of the index, there is
still a major discrepancy in RAM vs. index size.
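
Concretely, with the numbers above: 30 GB of RAM minus a 20 GB heap leaves
roughly 10 GB of OS disk cache for a 260 GB index - under 4% of the index
size, far below the guideline given below.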

You either need more memory or you need to reduce the size of your
index.  The size of the indexed portion generally has more of an impact
on performance than the size of the stored portion, but they do both
have an impact, especially on indexing and committing.  With regular
disks, it's best to have at least 50% of your index size available to
the OS disk cache, but 100% is better.

http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache

If you are already using SSD, you might think there can't be
memory-related performance problems ... but you still need a pretty
significant chunk of disk cache.

https://wiki.apache.org/solr/SolrPerformanceProblems#SSD

Thanks,
Shawn



Re: Commit behaviour in SolrCloud

2013-11-24 Thread Mark Miller
SolrCloud does not use commits for update acceptance promises.

The idea is, if you get a success from the update, it’s in the system, commit 
or not.

Soft Commits are used for visibility only.

Standard Hard Commits are used essentially for internal purposes and should be 
done via auto commit generally.
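
For example, a common solrconfig.xml pattern looks like this (the values are
illustrative, not a recommendation from this thread):

<autoCommit>
    <maxTime>15000</maxTime>            <!-- hard commit: durability -->
    <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
    <maxTime>1000</maxTime>             <!-- soft commit: visibility -->
</autoSoftCommit>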

To your question though - it is fine to send a commit while updates are coming 
in from another source - it’s just not generally necessary to do that anyway.

- Mark

On Nov 24, 2013, at 1:01 PM, adfel70 adfe...@gmail.com wrote:

 Hi everyone,
 
 I am wondering how the commit operation works in SolrCloud:
 Say I have 2 parallel indexing processes. What if one process sends a big
 update request (an add command with a lot of docs), and the other one just
 happens to send a commit command while the update request is being
 processed?
 Is it possible that only part of the documents will be committed?
 What will happen with the other docs? Is Solr transactional, and does it
 promise that there will be no partial results?
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Commit-behaviour-in-SolrCloud-tp4102879.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Commit behaviour in SolrCloud

2013-11-24 Thread Furkan KAMACI
I suggest you read this:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Thanks;
Furkan KAMACI


2013/11/24 Mark Miller markrmil...@gmail.com

 SolrCloud does not use commits for update acceptance promises.

 The idea is, if you get a success from the update, it’s in the system,
 commit or not.

 Soft Commits are used for visibility only.

 Standard Hard Commits are used essentially for internal purposes and
 should be done via auto commit generally.

 To your question though - it is fine to send a commit while updates are
 coming in from another source - it’s just not generally necessary to do
 that anyway.

 - Mark

 On Nov 24, 2013, at 1:01 PM, adfel70 adfe...@gmail.com wrote:

  Hi everyone,
 
  I am wondering how commit operation works in SolrCloud:
  Say I have 2 parallel indexing processes. What if one process sends big
  update request (an add command with a lot of docs), and the other one
 just
  happens to send a commit command while the update request is being
  processed.
  Is it possible that only part of the documents will be committed?
  What will happen with the other docs? Is Solr transactional and promise
 that
  there will be no partial results?
 
 
 
  --
  View this message in context:
 http://lucene.472066.n3.nabble.com/Commit-behaviour-in-SolrCloud-tp4102879.html
  Sent from the Solr - User mailing list archive at Nabble.com.




Re: Commit behaviour in SolrCloud

2013-11-24 Thread adfel70
Hi Mark, Thanks for the answer.

One more question though: You say that if I get a success from the update,
it’s in the system, commit or not. But when exactly do I get this feedback -
Is it one feedback per the whole request, or per one add inside the request?
I will give an example to clarify my question: Say I have a new empty index,
and I repeatedly send indexing requests - every request adds 500 new documents
to the index. Is it possible at some point during this process to query the
index and get a total of 1,030 docs? (Let's assume there were no indexing
errors from Solr)

Thanks again.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Commit-behaviour-in-SolrCloud-tp4102879p4102996.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Commit behaviour in SolrCloud

2013-11-24 Thread Mark Miller
If you want this promise and complete control, you pretty much need to do a doc 
per request and many parallel requests for speed.
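
In SolrJ that pattern looks roughly like this (a sketch; server and doc are
hypothetical names, and the point is that a failure is attributable to exactly
one document):

try {
    server.add(doc);   // one SolrInputDocument per request
} catch (SolrServerException | IOException e) {
    // exactly this document failed; log it or retry it
}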

The bulk and streaming methods of adding documents do not have a good
fine-grained error reporting strategy yet. It’s okay for certain use cases and
especially batch loading, and you will know when an update is rejected - it
just might not be easy to know which one in the batch / stream.

Documents that come in batches are added as they come / are processed - not in 
some atomic unit.

What controls how soon you will see documents or whether you will see them as 
they are still loading is simply when you soft commit and how many docs have 
been indexed when the soft commit happens.

- Mark

On Nov 25, 2013, at 1:03 AM, adfel70 adfe...@gmail.com wrote:

 Hi Mark, Thanks for the answer.
 
 One more question though: You say that if I get a success from the update,
 it’s in the system, commit or not. But when exactly do I get this feedback -
 Is it one feedback per the whole request, or per one add inside the request?
 I will give an example clarify my question: Say I have new empty index, and
 I repeatedly send indexing requests - every request adds 500 new documents
 to the index. Is it possible that in some point during this process, to
 query the index and get a total of 1,030 docs total? (Lets assume there were
 no indexing errors got from Solr)
 
 Thanks again.
 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Commit-behaviour-in-SolrCloud-tp4102879p4102996.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Commit behaviour in SolrCloud

2013-11-24 Thread adfel70
Just to clarify how these two phrases come together:
1. you will know when an update is rejected - it just might not be easy to
know which in the batch / stream

2. Documents that come in batches are added as they come / are processed -
not in some atomic unit.


If I send a batch of documents in one update request, and some of the docs
fail - will the other docs still remain in the system?
What if a soft commit occurred after some of the docs but before all of the
docs got processed, and then some of the remaining docs fail during
processing?
I assume that the client will get an error for the whole batch (because of
the current error reporting strategy), but which docs will remain in the
system? Only those which got processed before the fail, or none of the docs
in this batch?




Mark Miller-3 wrote
 If you want this promise and complete control, you pretty much need to do
 a doc per request and many parallel requests for speed.
 
 The bulk and streaming methods of adding documents do not have a good
 fine-grained error reporting strategy yet. It’s okay for certain use cases
 and especially batch loading, and you will know when an update is rejected
 - it just might not be easy to know which one in the batch / stream.
 
 Documents that come in batches are added as they come / are processed -
 not in some atomic unit.
 
 What controls how soon you will see documents or whether you will see them
 as they are still loading is simply when you soft commit and how many docs
 have been indexed when the soft commit happens.
 
 - Mark
 
 On Nov 25, 2013, at 1:03 AM, adfel70 adfel70@ wrote:
 
 Hi Mark, Thanks for the answer.
 
 One more question though: You say that if I get a success from the
 update,
 it’s in the system, commit or not. But when exactly do I get this
 feedback -
 Is it one feedback per the whole request, or per one add inside the
 request?
 I will give an example clarify my question: Say I have new empty index,
 and
 I repeatedly send indexing requests - every request adds 500 new
 documents
 to the index. Is it possible that in some point during this process, to
 query the index and get a total of 1,030 docs total? (Lets assume there
 were
 no indexing errors got from Solr)
 
 Thanks again.
 
 
 
 
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Commit-behaviour-in-SolrCloud-tp4102879p4102996.html
 Sent from the Solr - User mailing list archive at Nabble.com.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Commit-behaviour-in-SolrCloud-tp4102879p4102999.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Commit behaviour in SolrCloud

2013-11-24 Thread Mark Miller

On Nov 25, 2013, at 1:40 AM, adfel70 adfe...@gmail.com wrote:

 Just to clarify how these two phrases come together:
 1. you will know when an update is rejected - it just might not be easy to
 know which in the batch / stream
 
 2. Documents that come in batches are added as they come / are processed -
 not in some atomic unit.
 
 
 If I send a batch of documents in one update request, and some of the docs
 fail - will the other docs still remain in the system?

Yes.

 what if soft commit occurred after some of the docs but before all of the
 docs got processed, and then some of the remaining docs fail during
 processing?

soft commit is only about visibility.

 I assume that the client will get an error for the whole batch (because of
 the current error reporting strategy), but which docs will remain in the
 system? Only those which got processed before the fail, or none of the docs in
 this batch?

Generally, it will be those processed before the fail if you are using the bulk 
add methods. Somewhat depends on impls and such - for example CloudSolrServer 
can use multiple threads to route documents and so perhaps a couple documents 
after the fail make it in.


- Mark

 
 
 
 
 Mark Miller-3 wrote
 If you want this promise and complete control, you pretty much need to do
 a doc per request and many parallel requests for speed.
 
 The bulk and streaming methods of adding documents do not have a good
 fine-grained error reporting strategy yet. It’s okay for certain use cases
 and especially batch loading, and you will know when an update is rejected
 - it just might not be easy to know which one in the batch / stream.
 
 Documents that come in batches are added as they come / are processed -
 not in some atomic unit.
 
 What controls how soon you will see documents or whether you will see them
 as they are still loading is simply when you soft commit and how many docs
 have been indexed when the soft commit happens.
 
 - Mark
 
 On Nov 25, 2013, at 1:03 AM, adfel70 adfel70@ wrote:
 
 Hi Mark, Thanks for the answer.
 
 One more question though: You say that if I get a success from the
 update,
 it’s in the system, commit or not. But when exactly do I get this
 feedback -
 Is it one feedback per the whole request, or per one add inside the
 request?
 I will give an example clarify my question: Say I have new empty index,
 and
 I repeatedly send indexing requests - every request adds 500 new
 documents
 to the index. Is it possible that in some point during this process, to
 query the index and get a total of 1,030 docs total? (Lets assume there
 were
 no indexing errors got from Solr)
 
 Thanks again.
 
 
 
 
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Commit-behaviour-in-SolrCloud-tp4102879p4102996.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 
 
 --
 View this message in 
 context:http://lucene.472066.n3.nabble.com/Commit-behaviour-in-SolrCloud-tp4102879p4102999.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit vs soft-commit

2013-08-11 Thread Shreejay Nair
Yes a new searcher is opened with every soft commit. It's still considered
faster because it does not write to the disk which is a slow IO operation
and might take a lot more time.

On Sunday, August 11, 2013, tamanjit.bin...@yahoo.co.in wrote:

 Hi,
 Some confusion in my head.
  http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
 
 says that
 /A soft commit is much faster since it only makes index changes visible and
 does not fsync index files or write a new index descriptor./

 So this means that even with every softcommit a new searcher opens right?
 If
 it does, isn't it still very heavy?




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/commit-vs-soft-commit-tp4083817.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
-- 
Shreejay Nair
Sent from my mobile device. Please excuse brevity and typos.


Re: commit vs soft-commit

2013-08-11 Thread Erick Erickson
Soft commits also do not rebuild certain per-segment caches,
etc. A soft commit does invalidate the top-level caches, including
the caches you configure in solrconfig.xml.

So no, it's not free at all. Your soft commits should still
be as long an interval as makes sense in your app. But
they're still much faster than hard commits with openSearcher
set to false.

Best
Erick




On Sun, Aug 11, 2013 at 11:00 AM, Shreejay Nair shreej...@gmail.com wrote:

 Yes a new searcher is opened with every soft commit. It's still considered
 faster because it does not write to the disk which is a slow IO operation
 and might take a lot more time.

 On Sunday, August 11, 2013, tamanjit.bin...@yahoo.co.in wrote:

  Hi,
  Some confusion in my head.
  http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
  says that
  /A soft commit is much faster since it only makes index changes visible
 and
  does not fsync index files or write a new index descriptor./
 
  So this means that even with every softcommit a new searcher opens right?
  If
  it does, isn't it still very heavy?
 
 
 
 
  --
  View this message in context:
  http://lucene.472066.n3.nabble.com/commit-vs-soft-commit-tp4083817.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 


 --
 --
 Shreejay Nair
 Sent from my mobile device. Please excuse brevity and typos.



Re: commit vs soft-commit

2013-08-11 Thread tamanjit.bin...@yahoo.co.in
Erik-
/It does invalidate the top level caches, including the caches you
configure in solrconfig.xml. /

Could you elucidate?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-vs-soft-commit-tp4083817p4083844.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit vs soft-commit

2013-08-11 Thread Erick Erickson
Take a look at solrconfig.xml. You configure filterCache,
documentCache, queryResultCache. These (and
some others I believe, but certainly these) are _not_
per-segment caches, so are invalidated on soft commit.
Any autowarming you've specified also gets executed
if applicable.
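
For reference, those top-level caches are declared in solrconfig.xml along
these lines (a sketch; the sizes are illustrative). The documentCache is
typically not autowarmed, because internal document ids change between
searchers:

<filterCache class="solr.FastLRUCache" size="512" initialSize="512"
    autowarmCount="32" />
<queryResultCache class="solr.LRUCache" size="512" initialSize="512"
    autowarmCount="32" />
<documentCache class="solr.LRUCache" size="512" initialSize="512" />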

On the other hand, you specify short autocommit
intervals for tight NRT searching capabilities, so it's
likely these caches aren't being re-used all that much
anyway.

Best
Erick


On Sun, Aug 11, 2013 at 11:22 AM, tamanjit.bin...@yahoo.co.in 
tamanjit.bin...@yahoo.co.in wrote:

 Erik-
 /It does invalidate the top level caches, including the caches you
 configure in solrconfig.xml. /

 Could you elucidate?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/commit-vs-soft-commit-tp4083817p4083844.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Commit different database rows to solr with same id value?

2013-07-11 Thread Erick Erickson
Just use the address in the url. You don't have to use the core name
if the defaults are set, which is usually collection1.

So it's something like http://host:port/solr/core2/update? blah blah blah
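
In SolrJ terms that is just one server object per core (a sketch assuming
SolrJ 4.x and a core named core2; the names are hypothetical):

import org.apache.solr.client.solrj.impl.HttpSolrServer;

HttpSolrServer core2Server =
    new HttpSolrServer("http://host:8983/solr/core2");
core2Server.addBean(table2Pojo);   // indexed into core2, not collection1
core2Server.commit();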

Erick

On Wed, Jul 10, 2013 at 4:17 PM, Jason Huang jason.hu...@icare.com wrote:
 Thanks David.

 I am actually trying to commit the database row on the fly, not DIH. :)

 Anyway, if I understand you correctly, basically you are suggesting to
 modify the value of the primary key and pass the new value to id before
 committing to solr. This could probably be one solution.

 What if I want to commit the data from table2 to a new core? Anyone knows
 how I can do that?

 thanks,

 Jason

 On Wed, Jul 10, 2013 at 11:18 AM, David Quarterman da...@corexe.com wrote:

 Hi Jason,

 Assuming you're using DIH, why not build a new, unique id within the query
 to use as  the 'doc_id' for SOLR? We do something like this in one of our
 collections. In MySQL, try this (don't know what it would be for any other
 db but there must be equivalents):

 select @rownum:=@rownum+1 rowid, t.* from (main select query) t, (select
 @rownum:=0) s

 Regards,

 DQ

 -Original Message-
 From: Jason Huang [mailto:jason.hu...@icare.com]
 Sent: 10 July 2013 15:50
 To: solr-user@lucene.apache.org
 Subject: Commit different database rows to solr with same id value?

 Hello,

 I am trying to use Solr to store fields from two different database
 tables, where the primary keys are in the format of 1, 2, 3, 

 In Java, we build different POJO classes for these two database tables:

 table1.java

 @SolrIndex(name="id")

 private String idTable1

 


 table2.java

 @SolrIndex(name="id")

 private String idTable2



 And later we add these fields defined in the two different types of tables
 and commit it to solrServer.


 Here is the scenario where I am having issues:

 (1) commit a row from table1 with primary key = 3, this generates a
 document in Solr

 (2) commit another row from table2 with the same value of primary key =
 3, this overwrites the document generated in step (1).


 What we really want to achieve is to keep both rows in (1) and (2) because
 they are from different tables. I've read something from google search and
 it appears that we might be able to do it via keeping multiple cores in
 solr? Could anyone point at how to implement multiple core to achieve this?
 To be more specific, when I commit the row as a document, I don't have a
 place to pick a certain core and I am not sure if it makes any sense for me
 to specify a core when I commit the document since the layer I am working
 on should abstract it away from me.



 The second question is - if we don't want to do a multicore (since we
 can't easily search for related data between multiple cores), how can we
 resolve this issue so both rows from different database table which shares
 the same primary key still exist? We don't want to have to always change
 the primary key format to ensure a uniqueness of the primary key among all
 different types of database tables.


 thanks!


 Jason



Re: Commit different database rows to solr with same id value?

2013-07-11 Thread Jason Huang
cool.

So far I've been using the default collection1 only.

thanks,

Jason

On Thu, Jul 11, 2013 at 7:57 AM, Erick Erickson erickerick...@gmail.comwrote:

 Just use the address in the url. You don't have to use the core name
 if the defaults are set, which is usually collection1.

 So it's something like http://host:port/solr/core2/update? blah blah blah

 Erick

 On Wed, Jul 10, 2013 at 4:17 PM, Jason Huang jason.hu...@icare.com
 wrote:
  Thanks David.
 
  I am actually trying to commit the database row on the fly, not DIH. :)
 
  Anyway, if I understand you correctly, basically you are suggesting to
  modify the value of the primary key and pass the new value to id before
  committing to solr. This could probably be one solution.
 
  What if I want to commit the data from table2 to a new core? Anyone knows
  how I can do that?
 
  thanks,
 
  Jason
 
  On Wed, Jul 10, 2013 at 11:18 AM, David Quarterman da...@corexe.com
 wrote:
 
  Hi Jason,
 
  Assuming you're using DIH, why not build a new, unique id within the
 query
  to use as  the 'doc_id' for SOLR? We do something like this in one of
 our
  collections. In MySQL, try this (don't know what it would be for any
 other
  db but there must be equivalents):
 
  select @rownum:=@rownum+1 rowid, t.* from (main select query) t, (select
  @rownum:=0) s
 
  Regards,
 
  DQ
 
  -Original Message-
  From: Jason Huang [mailto:jason.hu...@icare.com]
  Sent: 10 July 2013 15:50
  To: solr-user@lucene.apache.org
  Subject: Commit different database rows to solr with same id value?
 
  Hello,
 
  I am trying to use Solr to store fields from two different database
  tables, where the primary keys are in the format of 1, 2, 3, 
 
  In Java, we build different POJO classes for these two database tables:
 
  table1.java
 
  @SolrIndex(name="id")
 
  private String idTable1
 
  
 
 
  table2.java
 
  @SolrIndex(name="id")
 
  private String idTable2
 
 
 
  And later we add these fields defined in the two different types of
 tables
  and commit it to solrServer.
 
 
  Here is the scenario where I am having issues:
 
  (1) commit a row from table1 with primary key = 3, this generates a
  document in Solr
 
  (2) commit another row from table2 with the same value of primary key =
  3, this overwrites the document generated in step (1).
 
 
  What we really want to achieve is to keep both rows in (1) and (2)
 because
  they are from different tables. I've read something from google search
 and
  it appears that we might be able to do it via keeping multiple cores in
  solr? Could anyone point at how to implement multiple core to achieve
 this?
  To be more specific, when I commit the row as a document, I don't have a
  place to pick a certain core and I am not sure if it makes any sense
 for me
  to specify a core when I commit the document since the layer I am
 working
  on should abstract it away from me.
 
 
 
  The second question is - if we don't want to do a multicore (since we
  can't easily search for related data between multiple cores), how can we
  resolve this issue so both rows from different database table which
 shares
  the same primary key still exist? We don't want to have to always change
  the primary key format to ensure a uniqueness of the primary key among
 all
  different types of database tables.
 
 
  thanks!
 
 
  Jason
 



RE: Commit different database rows to solr with same id value?

2013-07-10 Thread David Quarterman
Hi Jason,

Assuming you're using DIH, why not build a new, unique id within the query to 
use as  the 'doc_id' for SOLR? We do something like this in one of our 
collections. In MySQL, try this (don't know what it would be for any other db 
but there must be equivalents):

select @rownum:=@rownum+1 rowid, t.* from (main select query) t, (select 
@rownum:=0) s
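
Another common way to guarantee uniqueness across tables (hypothetical column
names, same idea) is to prefix the key with the table name:

select concat('table1-', t.id) as doc_id, t.* from table1 t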

Regards,

DQ

-Original Message-
From: Jason Huang [mailto:jason.hu...@icare.com] 
Sent: 10 July 2013 15:50
To: solr-user@lucene.apache.org
Subject: Commit different database rows to solr with same id value?

Hello,

I am trying to use Solr to store fields from two different database tables, 
where the primary keys are in the format of 1, 2, 3, 

In Java, we build different POJO classes for these two database tables:

table1.java

@SolrIndex(name="id")

private String idTable1




table2.java

@SolrIndex(name="id")

private String idTable2



And later we add these fields defined in the two different types of tables and 
commit it to solrServer.


Here is the scenario where I am having issues:

(1) commit a row from table1 with primary key = 3, this generates a document 
in Solr

(2) commit another row from table2 with the same value of primary key = 3, 
this overwrites the document generated in step (1).


What we really want to achieve is to keep both rows in (1) and (2) because they 
are from different tables. I've read something from google search and it 
appears that we might be able to do it via keeping multiple cores in solr? 
Could anyone point at how to implement multiple core to achieve this?
To be more specific, when I commit the row as a document, I don't have a place 
to pick a certain core and I am not sure if it makes any sense for me to 
specify a core when I commit the document since the layer I am working on 
should abstract it away from me.



The second question is - if we don't want to do a multicore (since we can't 
easily search for related data between multiple cores), how can we resolve this 
issue so both rows from different database table which shares the same primary 
key still exist? We don't want to have to always change the primary key format 
to ensure a uniqueness of the primary key among all different types of database 
tables.


thanks!


Jason


Re: Commit different database rows to solr with same id value?

2013-07-10 Thread Jason Huang
Thanks David.

I am actually trying to commit the database row on the fly, not DIH. :)

Anyway, if I understand you correctly, basically you are suggesting to
modify the value of the primary key and pass the new value to id before
committing to solr. This could probably be one solution.

What if I want to commit the data from table2 to a new core? Anyone knows
how I can do that?

thanks,

Jason

On Wed, Jul 10, 2013 at 11:18 AM, David Quarterman da...@corexe.com wrote:

 Hi Jason,

 Assuming you're using DIH, why not build a new, unique id within the query
 to use as  the 'doc_id' for SOLR? We do something like this in one of our
 collections. In MySQL, try this (don't know what it would be for any other
 db but there must be equivalents):

 select @rownum:=@rownum+1 rowid, t.* from (main select query) t, (select
 @rownum:=0) s

 Regards,

 DQ

 -Original Message-
 From: Jason Huang [mailto:jason.hu...@icare.com]
 Sent: 10 July 2013 15:50
 To: solr-user@lucene.apache.org
 Subject: Commit different database rows to solr with same id value?

 Hello,

 I am trying to use Solr to store fields from two different database
 tables, where the primary keys are in the format of 1, 2, 3, 

 In Java, we build different POJO classes for these two database tables:

 table1.java

 @SolrIndex(name="id")

 private String idTable1

 


 table2.java

 @SolrIndex(name="id")

 private String idTable2



 And later we add these fields defined in the two different types of tables
 and commit it to solrServer.


 Here is the scenario where I am having issues:

 (1) commit a row from table1 with primary key = 3, this generates a
 document in Solr

 (2) commit another row from table2 with the same value of primary key =
 3, this overwrites the document generated in step (1).


 What we really want to achieve is to keep both rows in (1) and (2) because
 they are from different tables. I've read something from google search and
 it appears that we might be able to do it via keeping multiple cores in
 solr? Could anyone point at how to implement multiple core to achieve this?
 To be more specific, when I commit the row as a document, I don't have a
 place to pick a certain core and I am not sure if it makes any sense for me
 to specify a core when I commit the document since the layer I am working
 on should abstract it away from me.



 The second question is - if we don't want to do a multicore (since we
 can't easily search for related data between multiple cores), how can we
 resolve this issue so both rows from different database table which shares
 the same primary key still exist? We don't want to have to always change
 the primary key format to ensure a uniqueness of the primary key among all
 different types of database tables.


 thanks!


 Jason



Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
Hi sandeep,

I made the changes you mentioned and tested again for the same set of docs, but
unfortunately the commit time increased.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060622.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
Hi Gopal,

I added the openSearcher parameter as mentioned by you, but on checking logs
I found that openSearcher was still true on commit. It is only when I
removed the autoSoftCommit parameter that the openSearcher parameter worked
and provided faster updates as well. However, I require soft commit in my
application.

Any suggestions.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060623.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit in solr4 takes a longer time

2013-05-03 Thread Sandeep Mestry
That's not ideal.
Can you post solrconfig.xml?
On 3 May 2013 07:41, vicky desai vicky.de...@germinait.com wrote:

 Hi sandeep,

 I made the changes you mentioned and tested again for the same set of docs,
 but
 unfortunately the commit time increased.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060622.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
My solrconfig.xml is as follows

<?xml version="1.0" encoding="UTF-8" ?>
<config>
    <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
    <indexConfig>
        <maxFieldLength>2147483647</maxFieldLength>
        <lockType>simple</lockType>
        <unlockOnStartup>true</unlockOnStartup>
    </indexConfig>
    <updateHandler class="solr.DirectUpdateHandler2">
        <autoSoftCommit>
            <maxDocs>500</maxDocs>
            <maxTime>1000</maxTime>
        </autoSoftCommit>
        <autoCommit>
            <maxDocs>5</maxDocs>
            <maxTime>30</maxTime>
            <openSearcher>false</openSearcher>
        </autoCommit>
    </updateHandler>

    <requestDispatcher handleSelect="true">
        <requestParsers enableRemoteStreaming="false"
            multipartUploadLimitInKB="204800" />
    </requestDispatcher>

    <requestHandler name="standard" class="solr.StandardRequestHandler"
        default="true" />
    <requestHandler name="/update" class="solr.UpdateRequestHandler" />
    <requestHandler name="/admin/"
        class="org.apache.solr.handler.admin.AdminHandlers" />
    <requestHandler name="/replication" class="solr.ReplicationHandler" />
    <directoryFactory name="DirectoryFactory"
        class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}" />
    <enableLazyFieldLoading>true</enableLazyFieldLoading>
    <admin>
        <defaultQuery>*:*</defaultQuery>
    </admin>
</config>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060628.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
Hi All,

setting the openSearcher flag to false worked and gave me a visible
improvement in commit time. One thing to make note of is that while using the
solrj client we have to call server.commit(false,false), which I was doing
incorrectly and hence was not able to see the improvement earlier.
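
For reference, the two booleans on that SolrJ call are waitFlush and
waitSearcher (a sketch; server is an already-initialized SolrServer):

server.commit(false, false);  // return immediately; don't block on the flush
                              // or on opening the new searcher
server.commit();              // equivalent to commit(true, true)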

Thanks everyone



--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060688.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
Hi,

After using the following config

<updateHandler class="solr.DirectUpdateHandler2">
    <autoSoftCommit>
        <maxDocs>500</maxDocs>
        <maxTime>1000</maxTime>
    </autoSoftCommit>
    <autoCommit>
        <maxDocs>5000</maxDocs>
        <openSearcher>false</openSearcher>
    </autoCommit>
</updateHandler>

When a commit operation is fired I am getting the following log:
INFO: start
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

even though openSearcher is false, waitSearcher is true. Can that be set
to false too? Will that give a performance improvement, and what is the
config for that?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060706.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit in solr4 takes a longer time

2013-05-03 Thread Gopal Patwa
Since you have defined auto commit options for both hard and soft commit,
you don't have to explicitly call commit from the SolrJ client. And
openSearcher=false for hard commit will make the hard commit faster, since it
only makes sure that recent changes are flushed to disk (for durability)
and does not open any searcher.

Can you post your log from when soft commit and hard commit happen?

You can read about waitFlush=false and waitSearcher=false, which default to
true; see below from the javadoc:

JavaDoc:
*waitFlush* block until index changes are flushed to disk
*waitSearcher* block until a new searcher is opened and registered as the
main query searcher, making the changes visible


On Fri, May 3, 2013 at 7:19 AM, vicky desai vicky.de...@germinait.comwrote:

 Hi All,

 setting the openSearcher flag to false worked and gave me a visible
 improvement in commit time. One thing to make note of is that while using the
 solrj client we have to call server.commit(false,false), which I was doing
 incorrectly and hence was not able to see the improvement earlier.

 Thanks everyone



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060688.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
Hi,

When an auto commit operation is fired I am getting the following log:
INFO: start
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

setting openSearcher to false definitely gave me a lot of performance
improvement, but I was wondering if waitSearcher can also be set to false and
whether that would give me a performance gain too.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060715.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit in solr4 takes a longer time

2013-05-03 Thread Shawn Heisey

On 5/3/2013 9:28 AM, vicky desai wrote:

Hi,

When an auto commit operation is fired I am getting the following log:
INFO: start
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

setting openSearcher to false definitely gave me a lot of performance
improvement, but I was wondering if waitSearcher can also be set to false and
whether that would give me a performance gain too.


The openSearcher parameter changes what actually happens when you do a 
hard commit, so using it can change your performance.


The wait parameters are for client software that does commits.  The 
idea is that if you don't want your client to wait for the commit to 
finish, you use these options so that the commit API call will return 
quickly and the server will finish the commit in the background.  It 
doesn't change what the commit does, it just allows the client to start 
doing other things.


With auto commits, the client and the server are both Solr, and 
everything is multi-threaded.  The wait parameters have no meaning, 
because there's no user software that has to wait.  There would be no 
performance gain from turning them off.


Side note: The waitFlush parameter was completely removed in Solr 4.0.

Thanks,
Shawn



Re: commit in solr4 takes a longer time

2013-05-02 Thread Furkan KAMACI
Can you explain more about your document size, shard and replica sizes, and
auto/soft commit time parameters?

2013/5/2 vicky desai vicky.de...@germinait.com

 Hi all,

 I have recently migrated from solr 3.6 to solr 4.0. The documents in my
 core
 are getting constantly updated and so I fire a commit from code after every
 10 thousand docs. However, moving from 3.6 to 4.0 I have noticed that for the
 same core size it takes about twice the time to commit in solr4.0 compared
 to solr 3.6.

 Is there any workaround by which I can reduce this time. Any help would be
 highly appreciated



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit in solr4 takes a longer time

2013-05-02 Thread vicky desai
Hi,

I am using 1 shard and two replicas. The document count is around 6 lakhs (600,000).


My solrconfig.xml is as follows
<?xml version="1.0" encoding="UTF-8" ?>
<config>
    <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
    <indexConfig>
        <maxFieldLength>2147483647</maxFieldLength>
        <lockType>simple</lockType>
        <unlockOnStartup>true</unlockOnStartup>
    </indexConfig>
    <updateHandler class="solr.DirectUpdateHandler2">
        <autoSoftCommit>
            <maxDocs>500</maxDocs>
            <maxTime>1000</maxTime>
        </autoSoftCommit>
        <autoCommit>
            <maxDocs>5</maxDocs>
            <maxTime>30</maxTime>
        </autoCommit>
    </updateHandler>

    <requestDispatcher handleSelect="true">
        <requestParsers enableRemoteStreaming="false"
            multipartUploadLimitInKB="204800" />
    </requestDispatcher>

    <requestHandler name="standard" class="solr.StandardRequestHandler"
        default="true" />
    <requestHandler name="/update" class="solr.UpdateRequestHandler" />
    <requestHandler name="/admin/"
        class="org.apache.solr.handler.admin.AdminHandlers" />
    <requestHandler name="/replication" class="solr.ReplicationHandler" />
    <directoryFactory name="DirectoryFactory"
        class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}" />
    <enableLazyFieldLoading>true</enableLazyFieldLoading>
    <admin>
        <defaultQuery>*:*</defaultQuery>
    </admin>
</config>




--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060402.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit in solr4 takes a longer time

2013-05-02 Thread Walter Underwood
First, I would upgrade to 4.2.1 and remember to change luceneMatchVersion to 
LUCENE_42.

There were a LOT of fixes between 4.0 and 4.2.1.

wunder

On May 2, 2013, at 12:16 AM, vicky desai wrote:

 Hi,
 
 I am using 1 shard and two replicas. Document size is around 6 lakhs 
 
 
 My solrconfig.xml is as follows
 <?xml version="1.0" encoding="UTF-8" ?>
 <config>
   <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
   <indexConfig>
     <maxFieldLength>2147483647</maxFieldLength>
     <lockType>simple</lockType>
     <unlockOnStartup>true</unlockOnStartup>
   </indexConfig>
   <updateHandler class="solr.DirectUpdateHandler2">
     <autoSoftCommit>
       <maxDocs>500</maxDocs>
       <maxTime>1000</maxTime>
     </autoSoftCommit>
     <autoCommit>
       <maxDocs>5</maxDocs>
       <maxTime>30</maxTime>
     </autoCommit>
   </updateHandler>

   <requestDispatcher handleSelect="true">
     <requestParsers enableRemoteStreaming="false"
       multipartUploadLimitInKB="204800" />
   </requestDispatcher>

   <requestHandler name="standard" class="solr.StandardRequestHandler"
     default="true" />
   <requestHandler name="/update" class="solr.UpdateRequestHandler" />
   <requestHandler name="/admin/"
     class="org.apache.solr.handler.admin.AdminHandlers" />
   <requestHandler name="/replication" class="solr.ReplicationHandler" />
   <directoryFactory name="DirectoryFactory"
     class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}" />
   <enableLazyFieldLoading>true</enableLazyFieldLoading>
   <admin>
     <defaultQuery>*:*</defaultQuery>
   </admin>
 </config>
 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060402.html
 Sent from the Solr - User mailing list archive at Nabble.com.

--
Walter Underwood
wun...@wunderwood.org





Re: commit in solr4 takes a longer time

2013-05-02 Thread Gopal Patwa
you might want to add openSearcher=false for hard commit, so the hard commit
also acts like a soft commit

<autoCommit>
    <maxDocs>5</maxDocs>
    <maxTime>30</maxTime>
    <openSearcher>false</openSearcher>
</autoCommit>



On Thu, May 2, 2013 at 12:16 AM, vicky desai vicky.de...@germinait.comwrote:

 Hi,

 I am using 1 shard and two replicas. Document size is around 6 lakhs


 My solrconfig.xml is as follows
 <?xml version="1.0" encoding="UTF-8" ?>
 <config>
   <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
   <indexConfig>
     <maxFieldLength>2147483647</maxFieldLength>
     <lockType>simple</lockType>
     <unlockOnStartup>true</unlockOnStartup>
   </indexConfig>
   <updateHandler class="solr.DirectUpdateHandler2">
     <autoSoftCommit>
       <maxDocs>500</maxDocs>
       <maxTime>1000</maxTime>
     </autoSoftCommit>
     <autoCommit>
       <maxDocs>5</maxDocs>
       <maxTime>30</maxTime>
     </autoCommit>
   </updateHandler>

   <requestDispatcher handleSelect="true">
     <requestParsers enableRemoteStreaming="false"
       multipartUploadLimitInKB="204800" />
   </requestDispatcher>

   <requestHandler name="standard" class="solr.StandardRequestHandler"
     default="true" />
   <requestHandler name="/update" class="solr.UpdateRequestHandler" />
   <requestHandler name="/admin/"
     class="org.apache.solr.handler.admin.AdminHandlers" />
   <requestHandler name="/replication" class="solr.ReplicationHandler" />
   <directoryFactory name="DirectoryFactory"
     class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}" />
   <enableLazyFieldLoading>true</enableLazyFieldLoading>
   <admin>
     <defaultQuery>*:*</defaultQuery>
   </admin>
 </config>




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060402.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit in solr4 takes a longer time

2013-05-02 Thread Furkan KAMACI
What exactly happens when you don't open a searcher at commit?

2013/5/2 Gopal Patwa gopalpa...@gmail.com

 you might want to add openSearcher=false for hard commit, so the hard commit
 also acts like a soft commit

 <autoCommit>
     <maxDocs>5</maxDocs>
     <maxTime>30</maxTime>
     <openSearcher>false</openSearcher>
 </autoCommit>



 On Thu, May 2, 2013 at 12:16 AM, vicky desai vicky.de...@germinait.com
 wrote:

  Hi,
 
  I am using 1 shard and two replicas. Document size is around 6 lakhs
 
 
  My solrconfig.xml is as follows
  <?xml version="1.0" encoding="UTF-8" ?>
  <config>
    <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
    <indexConfig>
      <maxFieldLength>2147483647</maxFieldLength>
      <lockType>simple</lockType>
      <unlockOnStartup>true</unlockOnStartup>
    </indexConfig>
    <updateHandler class="solr.DirectUpdateHandler2">
      <autoSoftCommit>
        <maxDocs>500</maxDocs>
        <maxTime>1000</maxTime>
      </autoSoftCommit>
      <autoCommit>
        <maxDocs>5</maxDocs>
        <maxTime>30</maxTime>
      </autoCommit>
    </updateHandler>

    <requestDispatcher handleSelect="true">
      <requestParsers enableRemoteStreaming="false"
        multipartUploadLimitInKB="204800" />
    </requestDispatcher>

    <requestHandler name="standard" class="solr.StandardRequestHandler"
      default="true" />
    <requestHandler name="/update" class="solr.UpdateRequestHandler" />
    <requestHandler name="/admin/"
      class="org.apache.solr.handler.admin.AdminHandlers" />
    <requestHandler name="/replication" class="solr.ReplicationHandler" />
    <directoryFactory name="DirectoryFactory"
      class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}" />
    <enableLazyFieldLoading>true</enableLazyFieldLoading>
    <admin>
      <defaultQuery>*:*</defaultQuery>
    </admin>
  </config>
 
 
 
 
  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060402.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 



Re: commit in solr4 takes a longer time

2013-05-02 Thread Alexandre Rafalovitch
If you don't re-open the searcher, you will not see new changes. So,
if you only have hard commit, you never see those changes (until
restart). But if you also have soft commit enabled, that will re-open
your searcher for you.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, May 2, 2013 at 11:21 AM, Furkan KAMACI furkankam...@gmail.com wrote:
 What happens exactly when you don't open searcher at commit?

 2013/5/2 Gopal Patwa gopalpa...@gmail.com

  you might want to add openSearcher=false for hard commit, so the hard commit
  also acts like a soft commit

  <autoCommit>
      <maxDocs>5</maxDocs>
      <maxTime>30</maxTime>
      <openSearcher>false</openSearcher>
  </autoCommit>



 On Thu, May 2, 2013 at 12:16 AM, vicky desai vicky.de...@germinait.com
 wrote:

  Hi,
 
  I am using 1 shard and two replicas. Document size is around 6 lakhs
 
 
  My solrconfig.xml is as follows
  <?xml version="1.0" encoding="UTF-8" ?>
  <config>
    <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
    <indexConfig>
      <maxFieldLength>2147483647</maxFieldLength>
      <lockType>simple</lockType>
      <unlockOnStartup>true</unlockOnStartup>
    </indexConfig>
    <updateHandler class="solr.DirectUpdateHandler2">
      <autoSoftCommit>
        <maxDocs>500</maxDocs>
        <maxTime>1000</maxTime>
      </autoSoftCommit>
      <autoCommit>
        <maxDocs>5</maxDocs>
        <maxTime>30</maxTime>
      </autoCommit>
    </updateHandler>

    <requestDispatcher handleSelect="true">
      <requestParsers enableRemoteStreaming="false"
        multipartUploadLimitInKB="204800" />
    </requestDispatcher>

    <requestHandler name="standard" class="solr.StandardRequestHandler"
      default="true" />
    <requestHandler name="/update" class="solr.UpdateRequestHandler" />
    <requestHandler name="/admin/"
      class="org.apache.solr.handler.admin.AdminHandlers" />
    <requestHandler name="/replication" class="solr.ReplicationHandler" />
    <directoryFactory name="DirectoryFactory"
      class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}" />
    <enableLazyFieldLoading>true</enableLazyFieldLoading>
    <admin>
      <defaultQuery>*:*</defaultQuery>
    </admin>
  </config>
 
 
 
 
  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060402.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 



Re: commit in solr4 takes a longer time

2013-05-02 Thread Sandeep Mestry
Hi Vicky,

I faced this issue as well, and after some experimentation I found the
autowarm count in the cache settings to be the problem.
I changed it from a fixed count (3072) to a percentage (10%), and commit
times were stable from then on: a percentage bounds how much of the old
cache each new searcher re-populates, so warming doesn't grow as the caches fill.

<filterCache class="solr.FastLRUCache" size="8192" initialSize="3072"
    autowarmCount="10%" />
<queryResultCache class="solr.LRUCache" size="16384" initialSize="3072"
    autowarmCount="10%" />
<documentCache class="solr.LRUCache" size="8192" initialSize="4096"
    autowarmCount="10%" />

HTH,
Sandeep


On 2 May 2013 16:31, Alexandre Rafalovitch arafa...@gmail.com wrote:

 If you don't re-open the searcher, you will not see new changes. So,
 if you only have hard commit, you never see those changes (until
 restart). But if you also have soft commit enabled, that will re-open
 your searcher for you.

 Regards,
Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Thu, May 2, 2013 at 11:21 AM, Furkan KAMACI furkankam...@gmail.com
 wrote:
  What happens exactly when you don't open searcher at commit?
 
  2013/5/2 Gopal Patwa gopalpa...@gmail.com
 
  you might want to add openSearcher=false for the hard commit, so the hard
  commit only makes the index durable and leaves visibility to the soft
  commit

  <autoCommit>
    <maxDocs>5</maxDocs>
    <maxTime>30</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
 
 
 
  On Thu, May 2, 2013 at 12:16 AM, vicky desai vicky.de...@germinait.com
  wrote:
 
   Hi,
  
   I am using 1 shard and two replicas. The document count is around 6 lakhs (600,000)
  
  
   My solrconfig.xml is as follows
   <?xml version="1.0" encoding="UTF-8" ?>
   <config>
   <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
   <indexConfig>
     <maxFieldLength>2147483647</maxFieldLength>
     <lockType>simple</lockType>
     <unlockOnStartup>true</unlockOnStartup>
   </indexConfig>
   <updateHandler class="solr.DirectUpdateHandler2">
     <autoSoftCommit>
       <maxDocs>500</maxDocs>
       <maxTime>1000</maxTime>
     </autoSoftCommit>
     <autoCommit>
       <maxDocs>5</maxDocs>
       <maxTime>30</maxTime>
     </autoCommit>
   </updateHandler>

   <requestDispatcher handleSelect="true">
     <requestParsers enableRemoteStreaming="false"
         multipartUploadLimitInKB="204800" />
   </requestDispatcher>

   <requestHandler name="standard"
       class="solr.StandardRequestHandler"
       default="true" />
   <requestHandler name="/update" class="solr.UpdateRequestHandler" />
   <requestHandler name="/admin/"
       class="org.apache.solr.handler.admin.AdminHandlers" />
   <requestHandler name="/replication"
       class="solr.ReplicationHandler" />
   <directoryFactory name="DirectoryFactory"
       class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}" />
   <enableLazyFieldLoading>true</enableLazyFieldLoading>
   <admin>
     <defaultQuery>*:*</defaultQuery>
   </admin>
   </config>
  
  
  
  
  
 



Re: commit

2013-03-13 Thread Upayavira
Auto commit would seem a good idea, as you don't want your independent
worker threads issuing overlapping commits. There's also commitWithin
that achieves the same thing.
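
For example (a sketch only; the host, collection and document are placeholders):

curl 'http://localhost:8983/solr/collection1/update?commitWithin=10000' \
  -H 'Content-Type: application/json' \
  -d '[{"id":"doc1"}]'

commitWithin=10000 asks Solr to make the update searchable within 10
seconds, so Solr batches the commit instead of each worker issuing its own.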

Upayavira

On Wed, Mar 13, 2013, at 08:02 AM, Arkadi Colson wrote:
 Hi
 
 I'm filling our Solr database with about 5 mil docs. All docs are in some 
 kind of queue which is processed by 5 simultaneous workers. What is the 
 best way to do commits in such a situation? If I let every worker 
 do a commit after 100 docs, there will be 5 commits in a short period. Or 
 should I use the autocommit option for this?
 
 Thx!
 
 Arkadi


Re: commit

2013-03-13 Thread Arkadi Colson
What would be a good value for maxTime or maxDocs knowing that we insert 
about 10 docs/sec? Will it be a problem that we only use maxDocs = 1 
because it's not searchable yet...


On 03/13/2013 10:00 AM, Upayavira wrote:

Auto commit would seem a good idea, as you don't want your independent
worker threads issuing overlapping commits. There's also commitWithin
that achieves the same thing.

Upayavira

On Wed, Mar 13, 2013, at 08:02 AM, Arkadi Colson wrote:

Hi

I'm filling our Solr database with about 5 mil docs. All docs are in some
kind of queue which is processed by 5 simultaneous workers. What is the
best way to do commits in such a situation? If I let every worker
do a commit after 100 docs, there will be 5 commits in a short period. Or
should I use the autocommit option for this?

Thx!

Arkadi






Re: commit

2013-03-13 Thread Upayavira
It depends whether you are using soft commits - that changes things a
lot.

If you aren't, then you should look in the admin interface, and see how
long it takes to warm your index, and commit no more frequently
than that (commit more often, and you'll have concurrent warming
searchers which will use up a lot of your memory).

If you are, then the commit frequency becomes less important. You could
use soft commits between 1s and 15s, and hard commits maybe every 15s to
1min. Those seem to me to be reasonable values.
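
As a minimal sketch in solrconfig.xml (the values are illustrative, picked
from within the ranges above):

<autoSoftCommit>
  <maxTime>5000</maxTime>             <!-- ~5s: re-open the searcher, changes become visible -->
</autoSoftCommit>
<autoCommit>
  <maxTime>60000</maxTime>            <!-- ~1min: flush to disk for durability -->
  <openSearcher>false</openSearcher>  <!-- don't open a searcher on the hard commit -->
</autoCommit>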

Upayavira

On Wed, Mar 13, 2013, at 09:19 AM, Arkadi Colson wrote:
 What would be a good value for maxTime or maxDocs knowing that we insert 
 about 10 docs/sec? Will it be a problem that we only use maxDocs = 1 
 because it's not searchable yet...
 
 On 03/13/2013 10:00 AM, Upayavira wrote:
  Auto commit would seem a good idea, as you don't want your independent
   worker threads issuing overlapping commits. There's also commitWithin
  that achieves the same thing.
 
  Upayavira
 
  On Wed, Mar 13, 2013, at 08:02 AM, Arkadi Colson wrote:
  Hi
 
  I'm filling our Solr database with about 5 mil docs. All docs are in some
  kind of queue which is processed by 5 simultaneous workers. What is the
  best way to do commits in such a situation? If I let every worker
  do a commit after 100 docs, there will be 5 commits in a short period. Or
  should I use the autocommit option for this?
 
  Thx!
 
  Arkadi
 
 


Re: commit

2013-03-13 Thread Arkadi Colson
Sorry I'm quite new to solr but where exactly in the admin interface can 
I find how long it takes to warm the index?


Arkadi

On 03/13/2013 11:19 AM, Upayavira wrote:

It depends whether you are using soft commits - that changes things a
lot.

If you aren't, then you should look in the admin interface, and see how
long it takes to warm your index, and commit no more frequently
than that (commit more often, and you'll have concurrent warming
searchers which will use up a lot of your memory).

If you are, then the commit frequency becomes less important. You could
use soft commits between 1s and 15s, and hard commits maybe every 15s to
1min. Those seem to me to be reasonable values.

Upayavira

On Wed, Mar 13, 2013, at 09:19 AM, Arkadi Colson wrote:

What would be a good value for maxTime or maxDocs knowing that we insert
about 10 docs/sec? Will it be a problem that we only use maxDocs = 1
because it's not searchable yet...

On 03/13/2013 10:00 AM, Upayavira wrote:

Auto commit would seem a good idea, as you don't want your independent
worker threads issuing overlapping commits. There's also commitWithin
that achieves the same thing.

Upayavira

On Wed, Mar 13, 2013, at 08:02 AM, Arkadi Colson wrote:

Hi

I'm filling our Solr database with about 5 mil docs. All docs are in some
kind of queue which is processed by 5 simultaneous workers. What is the
best way to do commits in such a situation? If I let every worker
do a commit after 100 docs, there will be 5 commits in a short period. Or
should I use the autocommit option for this?

Thx!

Arkadi






Re: commit

2013-03-13 Thread Timothy Potter
collection - Plugins / Stats - CORE - searcher (the warmupTime stat there
shows how long the last searcher took to warm)



On Wed, Mar 13, 2013 at 4:53 AM, Arkadi Colson ark...@smartbit.be wrote:

 Sorry I'm quite new to solr but where exactly in the admin interface can I
 find how long it takes to warm the index?

 Arkadi


 On 03/13/2013 11:19 AM, Upayavira wrote:

 It depends whether you are using soft commits - that changes things a
 lot.

 If you aren't, then you should look in the admin interface, and see how
 long it takes to warm your index, and commit no more frequently
 than that (commit more often, and you'll have concurrent warming
 searchers which will use up a lot of your memory).

 If you are, then the commit frequency becomes less important. You could
 use soft commits between 1s and 15s, and hard commits maybe every 15s to
 1min. Those seem to me to be reasonable values.

 Upayavira

 On Wed, Mar 13, 2013, at 09:19 AM, Arkadi Colson wrote:

 What would be a good value for maxTime or maxDocs knowing that we insert
 about 10 docs/sec? Will it be a problem that we only use maxDocs = 1
 because it's not searchable yet...

 On 03/13/2013 10:00 AM, Upayavira wrote:

 Auto commit would seem a good idea, as you don't want your independent
 worker threads issuing overlapping commits. There's also commitWithin
 that achieves the same thing.

 Upayavira

 On Wed, Mar 13, 2013, at 08:02 AM, Arkadi Colson wrote:

 Hi

 I'm filling our Solr database with about 5 mil docs. All docs are in some
 kind of queue which is processed by 5 simultaneous workers. What is the
 best way to do commits in such a situation? If I let every worker
 do a commit after 100 docs, there will be 5 commits in a short period. Or
 should I use the autocommit option for this?

 Thx!

 Arkadi






Re: Commit and OpenSearcher not working as expected.

2012-12-16 Thread Mark Miller
Try openSearcher (capital S) instead? Request parameter names are case-sensitive.

- Mark

On Dec 16, 2012, at 8:18 PM, shreejay shreej...@gmail.com wrote:

 Hello. 
 
 I am running a commit on a SolrCloud collection using a cron job. The
 command is as follows:
 
 aa.aa.aa.aa:8983/solr/ABCCollection/update?commit=true&opensearcher=false
 
 But when I check the logs, I see that the commit has been called with
 openSearcher=true. 
 
 The DirectUpdateHandler2 section in my solrconfig file looks like this:

 <updateHandler class="solr.DirectUpdateHandler2">

   <autoCommit>
     <maxDocs>0</maxDocs>
     <maxTime>0</maxTime>
   </autoCommit>
   <autoSoftCommit>
     <maxTime>0</maxTime>
   </autoSoftCommit>

   <openSearcher>false</openSearcher>
   <waitSearcher>false</waitSearcher>

   <updateLog>
     <str name="dir">${solr.data.dir:}</str>
   </updateLog>

 </updateHandler>
 
 
 
 And these are the logs :
 http://pastebin.com/bGh2GRvx
 
 
 I am not sure why openSearcher is being called. I am indexing a ton of
 documents right now, and am not using search at all. I also read in the wiki
 that keeping openSearcher=false is recommended for SolrCloud.
 http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section
 
 
 Is there somewhere else that openSearcher has to be set when calling a
 commit? 
 
 
 --Shreejay
 
 
 
 



Re: Commit and OpenSearcher not working as expected.

2012-12-16 Thread shreejay
Hi Mark, 

That was a typo in my post; I am in fact using openSearcher. But I still see
the same output in the logs. 

/update/?commit=true&openSearcher=false
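
For reference, the same request as a curl command (address and collection
taken from the original post; waitSearcher=false is an optional extra that
stops the request blocking on a searcher):

curl 'http://aa.aa.aa.aa:8983/solr/ABCCollection/update?commit=true&openSearcher=false&waitSearcher=false'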





