Re: external field file size

2020-09-01 Thread matthew sporleder
Okay thanks for the tip.  I am pretty wary of streaming logs into my
main set of documents + tons of $stat_updated_at fields + resetting
stats on ~every document every day + whatever else we feel like
trending.  It just feels like a lot of churn.

I will lean towards the !join on stats-$DATE probably.
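
For reference, a minimal sketch of what the `!join` on a per-day stats core might look like from the client side. The field names `doc_id` and `hits` and the core name are hypothetical, not taken from this thread; the `{!join from=... to=... fromIndex=...}` local-params form is the standard Solr join query parser syntax.

```python
from urllib.parse import urlencode


def build_join_params(stats_core: str, min_hits: int) -> dict:
    """Build Solr query parameters that filter main-index documents
    by a join against a per-day stats core.

    Hypothetical schema: each stats document has `doc_id` (matching the
    main index's `id`) and an integer `hits` field.
    """
    # Doubled braces emit literal { and } around the join local params.
    join = (
        f"{{!join from=doc_id to=id fromIndex={stats_core}}}"
        f"hits:[{min_hits} TO *]"
    )
    return {"q": "*:*", "fq": join}


params = build_join_params("stats-2020-09-01", 100)
query_string = urlencode(params)  # ready to append to /select?
```

Used as a filter like this, the join only restricts which documents match; ranking by the stats values would need something more, which this sketch does not attempt.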

On Tue, Sep 1, 2020 at 11:32 AM Erick Erickson wrote:
>
> I wouldn’t use ExternalFileField if your use-case is served by in-place
> updates. See
>
> https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html#in-place-updates
>
> EFFs were put in in order to have _some_ capability to change individual
> fields in a doc long before in-place updates were around and long before
> SolrCloud. Using EFF in any kind of sharded system will cause you
> significant heartburn in terms of keeping the file up to date on all
> replicas.
>
> Best,
> Erick
>
> > On Sep 1, 2020, at 11:21 AM, matthew sporleder wrote:
> >
> > We are researching the canonical use case for external fields --
> > traffic-based rankings
> >
> > What are the practical limits on the size of the external field file?
> > A k=v text file seems like it might fall over if it grows into the GB
> > range?
> >
> > Our other thought is to use rolling cores where we stream in web logs
> > and use !join queries.
> >
> > Does anyone have practical experience with this that they might want to share?
> >
> > Thanks,
> > Matt
>


Re: external field file size

2020-09-01 Thread Erick Erickson
I wouldn’t use ExternalFileField if your use-case is served by in-place
updates. See

https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html#in-place-updates

EFFs were put in in order to have _some_ capability to change individual
fields in a doc long before in-place updates were around and long before
SolrCloud. Using EFF in any kind of sharded system will cause you
significant heartburn in terms of keeping the file up to date on all
replicas.
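
As a rough illustration of the in-place update described at that link: the request uses the ordinary atomic-update syntax (`set` or `inc`), and Solr applies it in place only when the target field is single-valued, numeric, non-indexed, non-stored, and has docValues enabled. The field name `popularity` and the document id below are hypothetical.

```python
import json


def build_inplace_update(doc_id: str, field: str, value: float, op: str = "set") -> str:
    """Return a JSON body for Solr's /update handler that changes one
    numeric field on one document.

    `op` is "set" (overwrite) or "inc" (add to the current value).
    Whether Solr performs this in place depends on the schema definition
    of `field`, not on anything in this payload.
    """
    if op not in ("set", "inc"):
        raise ValueError(f"unsupported operation: {op!r}")
    return json.dumps([{"id": doc_id, field: {op: value}}])


# Hypothetical document id and field name, for illustration only.
body = build_inplace_update("doc33", "popularity", 1.414)
```

The body would be POSTed to the collection's `/update` handler with `Content-Type: application/json`; the sketch stops short of sending it so it runs without a Solr instance.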

Best,
Erick

> On Sep 1, 2020, at 11:21 AM, matthew sporleder wrote:
> 
> We are researching the canonical use case for external fields --
> traffic-based rankings
> 
> What are the practical limits on the size of the external field file?
> A k=v text file seems like it might fall over if it grows into the GB
> range?
> 
> Our other thought is to use rolling cores where we stream in web logs
> and use !join queries.
> 
> Does anyone have practical experience with this that they might want to share?
> 
> Thanks,
> Matt



external field file size

2020-09-01 Thread matthew sporleder
We are researching the canonical use case for external fields --
traffic-based rankings

What are the practical limits on the size of the external field file?
A k=v text file seems like it might fall over if it grows into the GB
range?
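
To make the k=v file concrete, here is a minimal sketch of generating one from a dict of traffic scores. It assumes the usual ExternalFileField layout, a file named `external_<fieldname>` in the index data directory with one `key=value` line per document; the field name `popularity` and the sample scores are made up. It says nothing about how large the file can safely grow.

```python
import os
import tempfile


def write_external_file(data_dir: str, field_name: str, scores: dict) -> str:
    """Write `scores` (doc key -> float) as key=value lines to
    external_<field_name> inside `data_dir` and return the final path."""
    path = os.path.join(data_dir, f"external_{field_name}")
    # The temp name deliberately does not start with "external_<field>",
    # so a half-written file cannot be mistaken for the real one.
    tmp_path = os.path.join(data_dir, f"tmp_external_{field_name}")
    with open(tmp_path, "w", encoding="utf-8") as fh:
        # Sorting by key is optional here; it is done on the assumption
        # that sorted keys make lookups cheaper on the Solr side.
        for key in sorted(scores):
            fh.write(f"{key}={scores[key]}\n")
    os.replace(tmp_path, path)  # swap the finished file into place
    return path


# Demo against a throwaway directory rather than a real Solr data dir.
demo_dir = tempfile.mkdtemp()
demo_path = write_external_file(demo_dir, "popularity", {"doc34": 3.14, "doc33": 1.414})
```

Writing to a temporary name and renaming keeps a reader from ever seeing a partial file, which matters more as the file gets bigger.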

Our other thought is to use rolling cores where we stream in web logs
and use !join queries.

Does anyone have practical experience with this that they might want to share?

Thanks,
Matt