Re: external field file size
Okay thanks for the tip. I am pretty wary of streaming logs into my main set of documents + tons of $stat_updated_at fields + resetting stats on ~every document every day + whatever else we feel like trending. It just feels like a lot of churn.

I will lean towards the !join on stats-$DATE probably.

On Tue, Sep 1, 2020 at 11:32 AM Erick Erickson wrote:
>
> I wouldn’t use ExternalFileField if your use-case is served by in-place
> updates. See
>
> https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html#in-place-updates
>
> EFFs were put in in order to have _some_ capability to change individual
> fields in a doc long before in-place updates were around and long before
> SolrCloud. Using EFF in any kind of sharded system will cause you
> significant heartburn in terms of keeping the file up to date on all
> replicas.
>
> Best,
> Erick
>
> > On Sep 1, 2020, at 11:21 AM, matthew sporleder wrote:
> >
> > We are researching the canonical use case for external fields --
> > traffic-based rankings
> >
> > What are the practical limits on the size of the external field file?
> > A k=v text file seems like it might fall over if it grows into the GB
> > range?
> >
> > Our other thought is to use rolling cores where we stream in web logs
> > and use !join queries.
> >
> > Does anyone have practical experience with this that they might want to
> > share?
> >
> > Thanks,
> > Matt
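For anyone reading along, here is a rough sketch of what the !join against a per-day stats collection could look like as query parameters. The collection naming (stats-YYYYMMDD) and the field names doc_id and hits are assumptions made up for illustration, not anything from the actual setup; only the {!join from=... to=... fromIndex=...} syntax comes from Solr's join query parser.

```python
from datetime import date


def stats_join_params(day, min_hits=100):
    """Build select params that filter main docs by a per-day stats collection."""
    # Hypothetical naming scheme standing in for "stats-$DATE".
    stats_collection = f"stats-{day:%Y%m%d}"
    # Keep main documents whose `id` equals `doc_id` on a stats document
    # matching the inner query. `doc_id` and `hits` are assumed field names.
    join = (
        f"{{!join from=doc_id to=id fromIndex={stats_collection}}}"
        f"hits:[{min_hits} TO *]"
    )
    return {"q": "*:*", "fq": join}


params = stats_join_params(date(2020, 9, 1))
print(params["fq"])
```

Used as a filter like this, the join only restricts which documents match; it does not by itself rank results by the stats value.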
Re: external field file size
I wouldn’t use ExternalFileField if your use-case is served by in-place updates. See

https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html#in-place-updates

EFFs were put in in order to have _some_ capability to change individual fields in a doc long before in-place updates were around and long before SolrCloud. Using EFF in any kind of sharded system will cause you significant heartburn in terms of keeping the file up to date on all replicas.

Best,
Erick

> On Sep 1, 2020, at 11:21 AM, matthew sporleder wrote:
>
> We are researching the canonical use case for external fields --
> traffic-based rankings
>
> What are the practical limits on the size of the external field file?
> A k=v text file seems like it might fall over if it grows into the GB
> range?
>
> Our other thought is to use rolling cores where we stream in web logs
> and use !join queries.
>
> Does anyone have practical experience with this that they might want to share?
>
> Thanks,
> Matt
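As a minimal sketch of the in-place update route: per the guide linked above, these are atomic-update documents that use the set or inc modifier on a single-valued, non-indexed, non-stored numeric docValues field. The field name traffic_rank and the document ids below are hypothetical, and the snippet only builds the JSON body; it would still have to be POSTed to the collection's update handler.

```python
import json


def in_place_update(doc_id, field, value, op="set"):
    """Return one atomic-update document that sets or increments a field."""
    if op not in ("set", "inc"):
        raise ValueError("in-place updates use the 'set' or 'inc' modifier")
    # Shape: {"id": <unique key>, <field>: {<modifier>: <value>}}
    return {"id": doc_id, field: {op: value}}


# `traffic_rank` is an assumed field name, not one from the thread.
docs = [
    in_place_update("doc-1", "traffic_rank", 42),
    in_place_update("doc-2", "traffic_rank", 5, op="inc"),
]
body = json.dumps(docs)
print(body)
```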
external field file size
We are researching the canonical use case for external fields -- traffic-based rankings

What are the practical limits on the size of the external field file? A k=v text file seems like it might fall over if it grows into the GB range?

Our other thought is to use rolling cores where we stream in web logs and use !join queries.

Does anyone have practical experience with this that they might want to share?

Thanks,
Matt
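To put a rough number on the k=v file size question, here is a small sketch that renders entries in key=value form, one per line, and extrapolates the byte count to a larger document count. The ids, values, and the 100 million document figure are invented for illustration, and the estimate assumes real keys and values have a similar average length to the sample.

```python
def render_external_file(values):
    """Render {doc_key: float} as key=value lines, sorted by key."""
    # Sorting is not required, but sorted keys are reported to make lookups faster.
    return "".join(f"{key}={value}\n" for key, value in sorted(values.items()))


def estimate_bytes(sample_text, n_docs):
    """Scale the average line size of a sample up to n_docs entries."""
    lines = sample_text.splitlines()
    avg_line_bytes = len(sample_text.encode("utf-8")) / len(lines)
    return int(avg_line_bytes * n_docs)


# Made-up sample data: two documents with traffic scores.
sample = render_external_file({"doc-2": 3.5, "doc-1": 12.0})
print(sample, end="")
print(estimate_bytes(sample, 100_000_000), "bytes for 100M docs (rough)")
```

With short keys like these, the estimate crosses into the GB range at around 100 million entries; longer keys push that threshold lower.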