Re: Atomic Updates : Performance Impact

2018-02-23 Thread Uday Jami
Thanks, Erick, for the useful information. I will keep the points below in
mind while designing my solution.

Thanks,
Uday



Re: Atomic Updates : Performance Impact

2018-02-23 Thread Erick Erickson
bq: However, if I don't have most of the other column data while doing update
operations, is it better to go with atomic updates?

I don't understand what you're asking. To use Atomic Updates, _every_
original field (i.e. any field that is _not_ the destination of a
copyField directive) must be stored. That's just a basic requirement.
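
As a rough illustration of the requirement above, the sketch below flags
fields that are neither stored nor copyField destinations. The field and
copyField dictionaries are a hypothetical schema invented for this example,
and the function name is not part of any Solr API. A field with no explicit
"stored" flag is treated as not stored, which is a conservative assumption.

```python
def fields_blocking_atomic_updates(fields, copy_fields):
    """Return names of original fields that are not stored.

    A field is "original" if it is not the destination of a copyField.
    Fields without an explicit "stored" flag are treated as not stored
    (an assumption made for this sketch).
    """
    copy_dests = {cf["dest"] for cf in copy_fields}
    return sorted(
        f["name"]
        for f in fields
        if f["name"] not in copy_dests and not f.get("stored", False)
    )


# Hypothetical schema, for illustration only.
fields = [
    {"name": "id", "stored": True},
    {"name": "title", "stored": True},
    {"name": "body", "stored": False},
    {"name": "all_text", "stored": False},
]
copy_fields = [
    {"source": "title", "dest": "all_text"},
    {"source": "body", "dest": "all_text"},
]

blocking = fields_blocking_atomic_updates(fields, copy_fields)
print(blocking)
```

Here "all_text" is ignored because it is a copyField destination, while
"body" is reported because it is an original field that is not stored.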

bq: And also during the update process, if there is a simultaneous search
request on the collection, will there be any lag in response?

This is just like any other update: the changes will be visible after
the next soft commit or hard commit with openSearcher=true.
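
The visibility rule above can be pictured with a toy model. This is not Solr
code; the ToyIndex class and its method names are invented for this sketch,
and it only models which updates a search can see, not durability or
performance.

```python
class ToyIndex:
    """Toy model of commit visibility. Not Solr code."""

    def __init__(self):
        self._pending = {}  # accepted updates that searches cannot see yet
        self._visible = {}  # what the currently open searcher sees

    def update(self, doc_id, doc):
        # The update is accepted but not yet visible to searches.
        self._pending[doc_id] = doc

    def soft_commit(self):
        # A soft commit opens a new searcher, so pending updates appear.
        self._open_searcher()

    def hard_commit(self, open_searcher=True):
        # In this model a hard commit changes visibility only when a new
        # searcher is opened.
        if open_searcher:
            self._open_searcher()

    def _open_searcher(self):
        self._visible.update(self._pending)
        self._pending.clear()

    def search(self, doc_id):
        return self._visible.get(doc_id)


idx = ToyIndex()
idx.update("1", {"price": 10})
before = idx.search("1")          # not visible yet
idx.hard_commit(open_searcher=False)
after_hard = idx.search("1")      # still not visible
idx.soft_commit()
after_soft = idx.search("1")      # visible now
print(before, after_hard, after_soft)
```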

Best,
Erick



Re: Atomic Updates : Performance Impact

2018-02-23 Thread Uday Jami
Hello Erick,

Thanks for the explanation.
However, if I don't have most of the other column data while doing update
operations, is it better to go with atomic updates?

And also during the update process, if there is a simultaneous search
request on the collection, will there be any lag in response?


Thanks,
Uday



Re: Atomic Updates : Performance Impact

2018-02-23 Thread Erick Erickson
The amount of work will be very close to what it would
take Solr to just index the documents from a client. Actually, it puts
a little _more_ load on Solr. When you do an Atomic
Update, Solr has to
1> fetch all the stored fields from the index
2> construct a Solr document
3> change the values in the doc based on the atomic update
4> re-index the doc just as though it had received it from a client.

Whereas if you just send the doc from an external client, Solr has to
1> de-serialize the doc
2> index it (identical to step 4 above)

The sweet spot for Atomic Updates is when you can't easily get the
original document from the system-of-record.
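
The atomic update steps listed above can be sketched roughly as follows. This
is a simplified model, not Solr's implementation: the function name and the
sample document are hypothetical, and only the "set" and "inc" modifiers are
handled. The `update` dictionary has the shape of an atomic update payload,
where each changed field maps to a modifier.

```python
import json


def apply_atomic_update(stored_doc, update):
    """Model steps 1-3: rebuild the full doc and apply the modifiers.

    The returned document is what would then be re-indexed (step 4),
    exactly as if a client had sent the whole document.
    """
    doc = dict(stored_doc)  # steps 1-2: stored fields become a new document
    for field, modifier in update.items():
        if field == "id":
            continue  # the id only identifies which document to update
        if not isinstance(modifier, dict):
            raise ValueError(f"field {field!r} needs a modifier such as 'set'")
        for op, operand in modifier.items():  # step 3: change the values
            if op == "set":
                doc[field] = operand
            elif op == "inc":
                doc[field] = doc.get(field, 0) + operand
            else:
                raise ValueError(f"modifier {op!r} is not handled in this sketch")
    return doc


# Hypothetical stored document and update, for illustration only.
stored = {"id": "doc1", "title": "Widget", "price": 10, "stock": 5}
update = {"id": "doc1", "price": {"set": 12}, "stock": {"inc": -1}}

result = apply_atomic_update(stored, update)
print(json.dumps([update]))  # the small payload the client sends
print(json.dumps(result))    # the full document that gets re-indexed
```

The client sends only the changed fields, but the full document is still
rebuilt and re-indexed, which is why the indexing cost is not reduced.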

Best,
Erick

On Fri, Feb 23, 2018 at 9:02 AM, Uday Jami wrote:
> Can you please let me know what the performance impact will be of trying to
> update 120 million records in a collection containing 1 billion records?
> The collection contains around 30 columns, and only one of them is
> updated as part of the atomic update.
> It's not a batch update; the 120 million updates will happen within 24 hours.
>
> How will search on this collection be impacted during the
> update process?
>
> Thanks,
> Uday
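
For a sense of scale, the figures in the question work out as below. This is
plain arithmetic on the numbers quoted above and assumes the updates are
spread evenly over the 24 hours, which the thread does not state.

```python
updates = 120_000_000            # atomic updates planned
collection_size = 1_000_000_000  # documents in the collection
window_seconds = 24 * 60 * 60    # 24 hours

rate = updates / window_seconds       # average updates per second
fraction = updates / collection_size  # share of the collection re-indexed

print(f"{rate:.0f} updates/second on average")      # 1389
print(f"{fraction:.0%} of the collection updated")  # 12%
```

Since each atomic update re-indexes the whole document, this is roughly 12%
of the collection being re-indexed within the day.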