Re: Need help on aggregation of nested documents
I have created a custom Collector extending SimpleCollector. I can see the methods scoreMode() and collect(int doc), and I can see that Lucene invokes collect with the child doc ID. Am I moving in the right direction?

To collect the values, though, I would need to load the Document with reader.document(int docID) and then parse it, which is the same issue I pointed out earlier.

Thanks
Gopal Sharma

On Tue, Nov 16, 2021 at 1:41 PM Adrien Grand wrote:
> Indeed you shouldn't load all hits; you should register an
> org.apache.lucene.search.Collector that will aggregate data while matches
> are being collected.
>
> Since you are already using a ToChildBlockJoinQuery, you should be able to
> use it in conjunction with utility classes from lucene/facets. Have you
> looked into it already?
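The cost concern above comes from loading stored documents inside collect(int doc). A minimal sketch of the alternative idea follows, in plain Java with no Lucene dependency: the collector reads one value per doc ID from a column and keeps running totals. The class name ColumnSumCollector and the long[] column are hypothetical stand-ins. In Lucene the closest equivalent would be a doc-values field (for example NumericDocValues obtained per segment in SimpleCollector's doSetNextReader), which assumes the amount was indexed with doc values.

```java
public class ColumnSumCollector {
    // Hypothetical per-document column; stands in for a doc-values field.
    private final long[] amountColumn;
    private long sum;
    private int count;

    ColumnSumCollector(long[] amountColumn) {
        this.amountColumn = amountColumn;
    }

    /** Called once per matching doc ID, in the spirit of Collector#collect(int doc). */
    void collect(int doc) {
        // Read a single value by doc ID; no stored document is loaded or parsed.
        sum += amountColumn[doc];
        count++;
    }

    long sum() {
        return sum;
    }

    int count() {
        return count;
    }

    public static void main(String[] args) {
        long[] amounts = {200, 150, 250, 0, 100, 300, 0};
        ColumnSumCollector collector = new ColumnSumCollector(amounts);
        // Pretend the query matched child docs 0, 2 and 5.
        for (int doc : new int[] {0, 2, 5}) {
            collector.collect(doc);
        }
        System.out.println(collector.sum() + " over " + collector.count() + " docs");
    }
}
```

The point of the design is that the work done per match is one array lookup, so the cost grows with the number of matches rather than with the size of the stored documents.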
Re: Need help on aggregation of nested documents
Indeed you shouldn't load all hits; you should register an org.apache.lucene.search.Collector that will aggregate data while matches are being collected.

Since you are already using a ToChildBlockJoinQuery, you should be able to use it in conjunction with utility classes from lucene/facets. Have you looked into it already?

On Tue, Nov 16, 2021 at 7:30 AM Gopal Sharma wrote:
> Hi Adrien,
>
> Thanks for the reply.
>
> I am able to retrieve the child doc IDs using ToChildBlockJoinQuery.
> Now, to compute aggregates, I need to load each document using
> reader.document(int docID), right? If that is the case, won't loading all
> the documents and only then computing the aggregates be a costly operation?
>
> Is there any other way around this?
>
> Thanks
> Gopal Sharma

--
Adrien
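To illustrate what "aggregate data while matches are being collected" can look like for grouped results, here is a small plain-Java sketch that sums an amount per fact type over the matching doc IDs. It only shows the underlying idea and is not the lucene/facets API; the names GroupedSumSketch, factType and amounts are hypothetical, and the arrays stand in for per-document columns.

```java
import java.util.Map;
import java.util.TreeMap;

public class GroupedSumSketch {

    /** Sums amounts per fact type, visiting only the matching doc IDs. */
    static Map<String, Long> sumByGroup(int[] matchingDocs, String[] factType, long[] amounts) {
        Map<String, Long> totals = new TreeMap<>();
        for (int doc : matchingDocs) {
            // merge() inserts the amount for a new key or adds it to the running total.
            totals.merge(factType[doc], amounts[doc], Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical per-document columns for four child documents.
        String[] factType = {"transaction", "view", "transaction", "transaction"};
        long[] amounts = {200, 0, 250, 100};
        // Pretend the query matched docs 0, 1 and 2.
        System.out.println(sumByGroup(new int[] {0, 1, 2}, factType, amounts));
        // prints {transaction=450, view=0}
    }
}
```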
Re: Need help on aggregation of nested documents
Hi Adrien,

Thanks for the reply.

I am able to retrieve the child doc IDs using ToChildBlockJoinQuery. Now, to compute aggregates, I need to load each document using reader.document(int docID), right? If that is the case, won't loading all the documents and only then computing the aggregates be a costly operation?

Is there any other way around this?

Thanks
Gopal Sharma

On Mon, Nov 15, 2021 at 10:36 PM Adrien Grand wrote:
> It's not straightforward, as we don't provide high-level tooling to do this.
> You need to use the BitSetProducer that you pass to the
> ToParentBlockJoinQuery in order to resolve the range of child doc IDs for a
> given parent doc ID (see e.g. how ToChildBlockJoinQuery does it), and then
> aggregate over these child doc IDs.
Re: Need help on aggregation of nested documents
It's not straightforward, as we don't provide high-level tooling to do this. You need to use the BitSetProducer that you pass to the ToParentBlockJoinQuery in order to resolve the range of child doc IDs for a given parent doc ID (see e.g. how ToChildBlockJoinQuery does it), and then aggregate over these child doc IDs.

On Mon, Nov 15, 2021 at 6:06 AM Gopal Sharma wrote:
> Hi Team,
>
> I have a document structure in which a customer has a few attributes
> such as gender, location, etc.
>
> Each customer will have a list of facts such as transactions, product
> views, etc.
>
> I want to do an aggregation of the facts. For example, find all customers
> who are from a specific location and have made transactions worth more than
> $500 within a given date range.
>
> The queries can go deeper than this.
>
> Thanks in advance.
>
> Gopal Sharma

--
Adrien
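The child-range resolution described above can be sketched in plain Java, relying on the block-join convention that children are indexed immediately before their parent. This is only an illustration under stated assumptions: java.util.BitSet stands in for the parent bit set that a BitSetProducer would supply per segment, the long[] amounts column is a hypothetical stand-in for per-document values, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class ChildRangeSketch {

    /** First child doc ID of the block that ends at parentDoc. */
    static int firstChild(BitSet parents, int parentDoc) {
        // The previous parent (or -1 if there is none) ends the previous block.
        return parents.previousSetBit(parentDoc - 1) + 1;
    }

    /** Sums the per-document amounts over the children of parentDoc. */
    static long sumChildren(BitSet parents, long[] amounts, int parentDoc) {
        long sum = 0;
        for (int child = firstChild(parents, parentDoc); child < parentDoc; child++) {
            sum += amounts[child];
        }
        return sum;
    }

    /** Returns the parent doc IDs whose children sum to more than the threshold. */
    static List<Integer> parentsAbove(BitSet parents, long[] amounts, long threshold) {
        List<Integer> result = new ArrayList<>();
        for (int p = parents.nextSetBit(0); p >= 0; p = parents.nextSetBit(p + 1)) {
            if (sumChildren(parents, amounts, p) > threshold) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Docs 0-2 are children of parent 3; docs 4-5 are children of parent 6.
        BitSet parents = new BitSet();
        parents.set(3);
        parents.set(6);
        long[] amounts = {200, 150, 250, 0, 100, 300, 0};
        System.out.println(parentsAbove(parents, amounts, 500)); // prints [3]
    }
}
```

A per-child condition such as a date range would go inside the loop in sumChildren, skipping children that fall outside the range before adding their amount.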
Need help on aggregation of nested documents
Hi Team,

I have a document structure in which a customer has a few attributes such as gender, location, etc.

Each customer will have a list of facts such as transactions, product views, etc.

I want to do an aggregation of the facts. For example, find all customers who are from a specific location and have made transactions worth more than $500 within a given date range.

The queries can go deeper than this.

Thanks in advance.

Gopal Sharma