Indeed, the behavior with the score=none local param is a query time
correlated with the size of the joined-collection subset: for a subset of
100k documents the query time is 1 second, it is 4 seconds for 1M, and I
get a client timeout (15 s) for anything above 5M.

On this basis I guess some redesign will be necessary to find the right
balance between normalization and de-normalization for the
insertion/selection speed trade-off.

Thanks

On Wed, Oct 16, 2019 at 03:32:33PM +0300, Mikhail Khludnev wrote:
> Note: adding score=none as a local param turns on another algorithm,
> dragging by from-side join.
>
> On Wed, Oct 16, 2019 at 11:37 AM Nicolas Paris <nicolas.pa...@riseup.net>
> wrote:
>
> > Sadly, the join performance is poor. The joined collection is 12M
> > documents, and the performance is 6k ms versus 60 ms when I compare to
> > the denormalized field.
> >
> > Apparently, the performance does not change when the filter on the
> > joined collection is changed. It is still 6k ms whether the subset is
> > 12M or 1 document in size. So the performance of the join looks
> > correlated to the size of the joined collection and not to the kind of
> > filter applied to it.
> >
> > I will explore the streaming expressions.
> >
> > On Wed, Oct 16, 2019 at 08:00:43AM +0200, Nicolas Paris wrote:
> > > > You can certainly replicate the joined collection to every shard.
> > > > It must fit in one shard and a replica of that shard must be
> > > > co-located with every replica of the “to” collection.
> > >
> > > Yes, I found this in the documentation, with a clear example, just
> > > after sending this mail. I will test it today. I also read your blog
> > > about join performance [1], and I suspect the performance impact of
> > > joins will be huge because the joined collection is about 10M
> > > documents (only two fields, a unique id and an array of longs, with
> > > a filter applied to the array; the join key is 10M unique IDs).
> > >
> > > > Have you looked at streaming and “streaming expressions”? It does
> > > > not have the same problem, although it does have its own
> > > > limitations.
> > >
> > > I never tested them, and I am not very comfortable yet with how to
> > > test them. Is it possible to mix query parsers and streaming
> > > expressions in the client call via http parameters, or are streaming
> > > expressions applied programmatically only?
> > >
> > > [1] https://lucidworks.com/post/solr-and-joins/
> > >
> > > On Tue, Oct 15, 2019 at 07:12:25PM -0400, Erick Erickson wrote:
> > > > You can certainly replicate the joined collection to every shard.
> > > > It must fit in one shard and a replica of that shard must be
> > > > co-located with every replica of the “to” collection.
> > > >
> > > > Have you looked at streaming and “streaming expressions”? It does
> > > > not have the same problem, although it does have its own
> > > > limitations.
> > > >
> > > > Best,
> > > > Erick
> > > >
> > > > > On Oct 15, 2019, at 6:58 PM, Nicolas Paris
> > > > > <nicolas.pa...@riseup.net> wrote:
> > > > >
> > > > > Hi
> > > > >
> > > > > I have several large collections that cannot fit in a standalone
> > > > > Solr instance. They are split over multiple shards in SolrCloud
> > > > > mode.
> > > > >
> > > > > Those collections are supposed to be joined to another
> > > > > collection to retrieve a subset. Because I am using distributed
> > > > > collections, I am not able to use the Solr join feature.
> > > > >
> > > > > For this reason, I denormalize the information by adding the
> > > > > joined collection within every collection. Naturally, when I
> > > > > want to update the joined collection, I have to update every one
> > > > > of the distributed collections.
> > > > >
> > > > > In standalone mode, I would only have to update the joined
> > > > > collection.
> > > > >
> > > > > I wonder if there is a way to overcome this limitation, for
> > > > > example by replicating the joined collection to every shard, or
> > > > > by another method I am ignoring.
> > > > >
> > > > > Any thoughts?
> > > > > --
> > > > > nicolas
> > >
> > > --
> > > nicolas
> >
> > --
> > nicolas
>
> --
> Sincerely yours
> Mikhail Khludnev

--
nicolas
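[Editor's note: for reference, the two approaches discussed in this
thread look roughly as follows. This is a sketch, not taken from the
thread itself: the collection names (events, people) and field names
(id, person_id, tags) are invented placeholders.]

    # Join query parser, cross-collection via fromIndex. The score=none
    # local param selects the non-scoring join algorithm Mikhail refers
    # to. As Erick notes, the "from" collection (people) must fit in one
    # shard, with a replica co-located with every replica of the "to"
    # collection (events).
    q={!join from=id to=person_id fromIndex=people score=none}tags:42

    # Rough streaming-expression equivalent using innerJoin. Both inner
    # streams must be sorted on the join key, and qt="/export" streams
    # the full result sets rather than one page.
    innerJoin(
      search(events, q="*:*", fl="id,person_id", sort="person_id asc", qt="/export"),
      search(people, q="tags:42", fl="id", sort="id asc", qt="/export"),
      on="person_id=id"
    )

Streaming expressions are sent to the /stream handler (as an expr
parameter over HTTP, or via SolrJ), so they are not combined with query
parsers inside a single q parameter.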