Okay thanks for the tip.  I am pretty wary of streaming logs into my
main set of documents + tons of $stat_updated_at fields + resetting
stats on ~every document every day + whatever else we feel like
trending.  It just feels like a lot of churn.

I will lean towards the !join on stats-$DATE probably.
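
For concreteness, here is a rough sketch of what that could look like from the client side. It assumes one stats core per day named stats-YYYY-MM-DD, a doc_id field in the stats core that matches id in the main core, and a numeric views field; those names and the stats_join_params helper are made up for illustration, and only the {!join from=... to=... fromIndex=...} syntax comes from Solr's join query parser.

```python
from datetime import date
from urllib.parse import urlencode


def stats_join_params(day, user_query="*:*", min_views=100):
    """Build query params that filter main-core docs through a per-day stats core.

    Hypothetical schema: the stats core has `doc_id` (matching `id` in the
    main core) and a numeric `views` field.
    """
    stats_core = f"stats-{day.isoformat()}"  # e.g. stats-2020-09-01
    # Doubled braces are literal braces inside an f-string.
    join_filter = (
        f"{{!join from=doc_id to=id fromIndex={stats_core}}}"
        f"views:[{min_views} TO *]"
    )
    # Keeping the join in fq leaves the user's query (and its scoring) untouched.
    return {"q": user_query, "fq": join_filter}


params = stats_join_params(date(2020, 9, 1))
print(params["fq"])        # the join filter for that day's stats core
print(urlencode(params))   # URL-encoded form for a /select request
```

Rolling to a new day would then only mean changing the date passed in, with no writes to the main documents. As far as I understand it, in SolrCloud the fromIndex collection has to be single-shard with a replica on every node that hosts the main collection, so that is worth checking before committing to this layout.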

On Tue, Sep 1, 2020 at 11:32 AM Erick Erickson <erickerick...@gmail.com> wrote:
>
> I wouldn’t use ExternalFileField if your use-case is served by in-place 
> updates. See
>
> https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html#in-place-updates
>
> EFFs were put in in order to have _some_ capability to change individual
> fields in a doc long before in-place updates were around and long before
> SolrCloud. Using EFF in any kind of sharded system will cause you
> significant heartburn in terms of keeping the file up to date on all
> replicas.
>
> Best,
> Erick
>
> > On Sep 1, 2020, at 11:21 AM, matthew sporleder <msporle...@gmail.com> wrote:
> >
> > We are researching the canonical use case for external fields --
> > traffic-based rankings
> >
> > What are the practical limits on the size of the external field file?
> > A k=v text file seems like it might fall over if it grows into the GB
> > range?
> >
> > Our other thought is to use rolling cores where we stream in web logs
> > and use !join queries.
> >
> > Does anyone have practical experience with this that they might want
> > to share?
> >
> > Thanks,
> > Matt
>
