Re: Parent child relationship, where children aren't nested but separate (like elasticsearch)

2016-11-17 Thread Dorian Hoxha
It's not mentioned on that page, but I'm assuming the join should work on
SolrCloud when joining within the same collection with the same routing (example:
users and user_events both routed by user_id, and joined on user_id).


On Thu, Nov 17, 2016 at 10:23 AM, Alexandre Rafalovitch 
wrote:

> You want just the usual join (not the block-join). That's the way it
> was before nested documents became supported.
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser
>
> Also, Elasticsearch - as far as I remember - stores the original
> document structure (including children) as a special field and then
> flattens all the children into parallel fields within parent. Which
> causes interesting hidden ranking issues, but that's an issue for a
> different day.
>
> Regards,
>    Alex.
> 
> Solr Example reading group is starting November 2016, join us at
> http://j.mp/SolrERG
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 17 November 2016 at 18:08, Dorian Hoxha  wrote:
> > Hi,
> >
> > I'm not finding a way to support parent-child like es does (using
> > blockjoin)? I've seen some blogs
> >  nested-documents-in-apache-solr>
> > with children nested inside the parent document, but I want to
> > freely CRUD children/parents as separate documents (I know that nested also
> > writes separate documents), with a special field to link them, and manually
> > route them to the same shard.
> >
> > Is this possible/available ?
> >
> > Thank You
>


Re: Parent child relationship, where children aren't nested but separate (like elasticsearch)

2016-11-17 Thread Alexandre Rafalovitch
You want just the usual join (not the block-join). That's the way it
was before nested documents became supported.
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser
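
As a rough illustration of the "usual join" mentioned above, the sketch below builds request parameters for Solr's join query parser, whose local-params syntax is {!join from=... to=...}. The doc_type and event_type fields and both helper names are assumptions invented for this example; only the {!join} syntax itself comes from the page linked above.

```python
from urllib.parse import urlencode


def join_query(from_field, to_field, inner_query):
    """Build a Solr join query string using the {!join} local-params syntax."""
    return "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)


def users_with_event(event_type):
    """Parameters to find user documents that have a matching event document.

    Assumes (hypothetically) that users and events live in the same
    collection, share a user_id field, and are told apart by doc_type.
    """
    return {
        # The inner query selects event documents; the join then returns
        # documents whose user_id matches one of those events.
        "q": join_query("user_id", "user_id",
                        "doc_type:event AND event_type:%s" % event_type),
        # Events carry user_id too, so keep only the user documents.
        "fq": "doc_type:user",
    }


params = users_with_event("login")
query_string = urlencode(params)  # ready to append to a /select URL
```

Children and parents stay separate documents here, so each can be added, updated, or deleted on its own; the link is just the shared field value.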

Also, Elasticsearch - as far as I remember - stores the original
document structure (including children) as a special field and then
flattens all the children into parallel fields within parent. Which
causes interesting hidden ranking issues, but that's an issue for a
different day.

Regards,
   Alex.

Solr Example reading group is starting November 2016, join us at
http://j.mp/SolrERG
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 17 November 2016 at 18:08, Dorian Hoxha  wrote:
> Hi,
>
> I'm not finding a way to support parent-child like es does (using
> blockjoin)? I've seen some blogs
> 
> with children nested inside the parent document, but I want to
> freely CRUD children/parents as separate documents (I know that nested also
> writes separate documents), with a special field to link them, and manually
> route them to the same shard.
>
> Is this possible/available ?
>
> Thank You


Parent child relationship, where children aren't nested but separate (like elasticsearch)

2016-11-16 Thread Dorian Hoxha
Hi,

I'm not finding a way to support parent-child like es does (using
blockjoin)? I've seen some blogs

with children nested inside the parent document, but I want to
freely CRUD children/parents as separate documents (I know that nested also
writes separate documents), with a special field to link them, and manually
route them to the same shard.

Is this possible/available ?

Thank You


Does updating a child document destroy the parent - child relationship

2014-06-24 Thread Vinay B,
When I edit a child document, a block join query for the parent no longer
returns any hits. I thought I read that this was the way things worked but
needed to know for sure.

If so, is there any other way to achieve this functionality (I can deal
with creating the child doc with the parent, but would like to edit it
separately).

My rough prototype code is at

https://github.com/balamuru/SolrChildDocs

and the code in question is commented out in
https://github.com/balamuru/SolrChildDocs/blob/master/src/main/java/com/vgb/solr/SolrApp.java


Thanks


Re: Does updating a child document destroy the parent - child relationship

2014-06-24 Thread Jack Krupansky
Block join is a very specialized feature of Solr - it requires that creation 
and update of the parent and all children be done as a single update 
operation for all of the documents. So... you cannot update a child document 
by itself, but need to update the entire block.
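
A minimal sketch of what "update the entire block" can look like on the client side: change the child in memory, then re-send the parent together with all of its children. The `_childDocuments_` key is the one Solr's JSON update format uses for nested documents in releases that support it (check your version); the `type` and `color` fields and the `rebuild_block` helper are made up for illustration.

```python
import copy


def rebuild_block(parent_doc, child_id, changes):
    """Return a new copy of the whole block with one child modified.

    The returned document (parent plus every child) is what would be
    re-indexed in a single update, replacing the old block.
    """
    block = copy.deepcopy(parent_doc)
    for child in block.get("_childDocuments_", []):
        if child["id"] == child_id:
            child.update(changes)
            break
    else:
        raise KeyError("no child with id %r in this block" % child_id)
    return block


parent = {
    "id": "parent-1",
    "type": "parent",
    "_childDocuments_": [
        {"id": "child-1", "type": "child", "color": "red"},
        {"id": "child-2", "type": "child", "color": "blue"},
    ],
}

# Edit one child, but keep the full block so it can be re-sent as a unit.
updated = rebuild_block(parent, "child-2", {"color": "green"})

# A block join parent query that should match the re-indexed block.
parent_query = '{!parent which="type:parent"}color:green'
```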


Unfortunately, this limitation does not appear to be documented in the Solr 
ref guide.


-- Jack Krupansky




Re: Parent-Child relationship

2012-05-04 Thread tamanjit.bin...@yahoo.co.in
Hi,
As per my understanding the join is confined to a single core only and it is
not possible to have joins between docs of different cores. Am I correct
here? If yes, is there a possibility of having joins across cores anytime
soon?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Parent-Child-relationship-tp3958259p3961509.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Parent-Child relationship

2012-05-04 Thread Erick Erickson
See: https://issues.apache.org/jira/browse/LUCENE-3759

No time-frame mentioned though.

Best
Erick



Parent-Child relationship

2012-05-03 Thread tamanjit.bin...@yahoo.co.in
Hi,
I just wanted to get some information about whether the Parent-Child
relationship between documents which Lucene has been talking about has been
implemented in Solr or not. I know a join patch is available; would that be
the only solution?

And another question: as and when this becomes possible (if it's not done
already), would such functionality (whether join or defining such
relations at index time) be available across different cores?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Parent-Child-relationship-tp3958259.html


Re: Parent-Child relationship

2012-05-03 Thread Mikhail Khludnev
Hello,

Here are my favorite ones:
http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html
https://issues.apache.org/jira/browse/SOLR-3076





-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: Parent-Child relationship

2012-05-03 Thread Erick Erickson
Solr join has been implemented for quite some time, see:
https://issues.apache.org/jira/browse/SOLR-2272
but only on trunk.

3076 is a refinement as I understand it.

FWIW
Erick



Re: Parent-Child relationship

2012-05-03 Thread Mikhail Khludnev
Erick,

Generally I agree, but could you please expand on your definition of
refinement? What does it mean?
I suggested SOLR-3076 because index time was mentioned.






-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: Parent-Child relationship

2012-05-03 Thread Erick Erickson
Right. See: 
http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/contrib-join/org/apache/lucene/search/join/package-summary.html

I guess refinement wasn't a good word choice. The basic join stuff
has been in Solr for a while (2272), but 3076 refers to exposing
functionality that currently exists in Lucene for use in Solr. So
depending on what you want to do with joins, it may already be in
Solr...

Best
Erick



Special Parent / Child relationship - advice / observations welcome on how to approach this

2010-11-23 Thread Bob Sandiford
Hi,

Long post - sorry...

I have a relatively special case of a Parent / Child relationship that I'm 
trying to model.

I'm currently using Solr 1.4.1 and Lucene 2.9.3

For example, my Parent documents represent Title Information (e.g.
bibliographic information), and each Parent document can contain 0 or more
Children, each child representing a physical copy of the Book information (e.g.
think of a library with multiple branches; the child documents each represent a
book (or other format) available at a given branch).

What I *think* is special about this set up is that the only information I need 
Solr / Lucene to make use of is all Facet based, AND, I don't need facet counts 
for these child facets.

So, for example, Harry Potter and the Last Crusade (J.K. Rowling's upcoming
blockbuster novel :)) has the following information.

Location   Format
Main       Book
Main       DVD
Branch     Book

etc etc.  This is a bit simplified - there are actually 5 fields involved for 
each child document, each of the five fields can be used individually (easy!) 
or in combination (much harder!) to refine a result set.

With the relatively straightforward approach of having Location and Format as 
ordinary everyday facet fields, I certainly get results (although I ignore the 
facet counts).  So, I search for the book, and, without any facets applied, get 
back facets like this:

Location
   Main
   Branch

Format
  Book
  DVD

And, I can narrow by those, and things work - though logically there's a hole 
that I'm trying to fill.

For example, suppose the user chooses to narrow by Location:Branch and 
Format:DVD.  I still get a hit back - but I don't want one, because there isn't 
a child record that has both of those values.  (The user is looking for DVD's 
at the Branch library, but the only DVD is at Main).


I'm completely controlling both the indexing and searching side code - i.e. I 
can formulate any type of document content I want to be indexed, and I can 
parse the results before presenting them to the end users.


One approach I've been thinking of is a brute force method of accomplishing 
this, using Facets and using the facet.prefix parameter in the query.

So - I could generate 'facets' like this:

Location_facet
   Main
  Branch

Format_facet
  Book
  DVD

Location-Format_facet
  Main-Book
  Main-DVD
  Branch-Book

Format-Location_facet
  Book-Main
  Book-Branch
  DVD-Main

When narrowing by a single facet (e.g. Location:Branch), it would be a usual 
facet search.  Something like:

   fq=Location_facet:Branch

and I would request back facets like this

  
   facet=true&facet.mincount=1&facet.field=Location-Format_facet&facet.prefix=Branch-

and then parse out the values returned in the Location-Format_facet to retrieve 
what follows the Branch- prefix, and those would be the facet values for the 
'Format' facet presented to the users (so only 'Book' remains as a value).

So - with only 2 fields, it's pretty straightforward.
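
The two-field scheme above can be sketched roughly as follows. The field names Location_facet and Location-Format_facet and the facet.prefix parameter are taken from the post; the helper functions and the shape of the holdings data are assumptions for illustration, and nothing here talks to a real Solr instance.

```python
def paired_facet_values(holdings):
    """Index-time values for the Location-Format_facet field of one title."""
    return sorted({"%s-%s" % (h["location"], h["format"]) for h in holdings})


def narrow_by_location(location):
    """Request parameters for narrowing by one location (as in the post)."""
    return [
        ("fq", "Location_facet:%s" % location),
        ("facet", "true"),
        ("facet.mincount", "1"),
        ("facet.field", "Location-Format_facet"),
        ("facet.prefix", "%s-" % location),
    ]


def formats_for(location, facet_values):
    """Strip the 'Location-' prefix to recover the Format values to display."""
    prefix = "%s-" % location
    return [v[len(prefix):] for v in facet_values if v.startswith(prefix)]


holdings = [
    {"location": "Main", "format": "Book"},
    {"location": "Main", "format": "DVD"},
    {"location": "Branch", "format": "Book"},
]

values = paired_facet_values(holdings)
branch_formats = formats_for("Branch", values)
```

With the example holdings, narrowing by Branch leaves only Book as a Format value, which is the behaviour described above.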

(It could be somewhat simplified from the above down to two facet fields 
instead of four - just keeping the paired facets, and not using the singleton 
facets, retrieving just one of those paired fields when no limiting is taking 
place, and parsing out the pairs for the Location and Format facets, and then 
when limiting on one element use facet.prefix, when limiting on both, again 
just choose one of the facets and look for the concatenated value...)


However, it gets more complex as I ramp up to 5 fields.  Generally it requires
n! individual facet fields, where 'n' is the number of underlying fields; i.e.
with two fields, there are two facet-type fields needed in the Solr/Lucene
index to support this.  With three fields, I could do this with 6 facets.
With 5 fields, there would be 5! = 120 required facets.  That's
getting a bit much... :)   Hmmm...   A little scribbling (ok, a fair bit of
scribbling), and I can actually reduce that to 12 facet fields to cover the 5
fields.   So, maybe that's not all that bad...  Interesting...
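
To make the n! growth concrete, here is a small sketch that enumerates every ordering of the underlying fields as a concatenated facet field name. Only Location and Format are named in the post; Field3 to Field5 are placeholders. It does not attempt the reduction to 12 fields, which the post does not spell out.

```python
from itertools import permutations
from math import factorial

# Location and Format come from the post; the other names are placeholders.
fields = ["Location", "Format", "Field3", "Field4", "Field5"]


def ordered_facet_names(names):
    """One concatenated facet field name per ordering of the given fields."""
    return ["-".join(order) + "_facet" for order in permutations(names)]


two_field_names = ordered_facet_names(fields[:2])
five_field_names = ordered_facet_names(fields)
```

For two fields this yields Location-Format_facet and Format-Location_facet; for five fields the list has 5! entries.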

(I haven't actually coded up anything yet - this is all a paper-napkin level 
exercise...)



The other thing I've done is perused various archived threads and some upcoming 
functionality regarding parent / child or hierarchical document strategies.  
But, I haven't found anything that would help me out much - at least not 
directly.

I saw the Jira issue LUCENE-2454 (https://issues.apache.org/jira/browse/LUCENE-2454),
"Nested Document Query Support", which looks from the slides overview to be
structurally just what I would want - but indicates that there isn't the Query
Parser support in place yet (i.e. how to do a Solr query that can
relate child-level queries either within the base query or in the fq clause).


So - my question (finally :)) is - does this logical problem seem resolvable, 
with an approach other than the brute force outlined above?  I'm willing

RE: Special Parent / Child relationship - advice / observations welcome on how to approach this

2010-11-23 Thread Jonathan Rochkind
I gather that your solr documents are the Title Information units. Have you 
considered making your Solr document collection be the book information units 
instead?   Each book information document will have (yes, de-normalized) the 
same title information as all the other book documents belonging to the same 
'title information'.  You can even give each 'book information' document some 
kind of 'title information' id that can be used to fetch all 'book information' 
documents belonging to the same 'title information'.  (If we just call these 
'bibs' and 'holdings' this might be less confusing for us library people). 

No doubt modelling things this way will bring its own challenges, but it will
solve the particular problems you mention, I believe. Solr is not an RDBMS, and
de-normalizing to the right level, so your Solr documents represent the proper
units of granularity for the kinds of queries you want to do, is usually the
trick to getting Solr to do what you want.
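
A minimal sketch of this de-normalized layout, with one document per holding carrying a copy of the title fields. All field names (title_id, location, format) and both helpers are assumptions for illustration; the in-memory filter only stands in for what two fq clauses on the same document would do in Solr.

```python
def holding_docs(title, holdings):
    """One document per holding, each repeating the title information."""
    docs = []
    for n, holding in enumerate(holdings, start=1):
        doc = dict(title)          # de-normalized copy of the title fields
        doc.update(holding)        # holding-level fields live on the same doc
        doc["id"] = "%s-h%d" % (title["title_id"], n)
        docs.append(doc)
    return docs


def titles_matching(docs, **wanted):
    """Title ids that have at least one holding matching ALL criteria."""
    return sorted({
        doc["title_id"] for doc in docs
        if all(doc.get(field) == value for field, value in wanted.items())
    })


title = {"title_id": "t1", "title": "Harry Potter and the Last Crusade"}
holdings = [
    {"location": "Main", "format": "Book"},
    {"location": "Main", "format": "DVD"},
    {"location": "Branch", "format": "Book"},
]
docs = holding_docs(title, holdings)

branch_dvd = titles_matching(docs, location="Branch", format="DVD")
main_dvd = titles_matching(docs, location="Main", format="DVD")
```

Because both criteria must hold on a single holding document, Branch plus DVD no longer produces a false hit, while Main plus DVD still finds the title.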

The challenge, of course, is when the kinds of queries you want really require 
multiple different levels of granularity.  I haven't found any great general 
purpose solutions to this problem, it's sort of the gap between what solr is 
good at and what an rdbms is good at. 


Best wasy to solve Parent-Child relationship without Denormalizing?

2010-01-19 Thread karthi_1986

Hi,

Here is an extract of my data schema in which my user should be able to
issue the following search:
company_description:pharmaceutical AND product_description:cosmetic

[Company profile]
 Company name
 Company url
 Company description
 Company user rating

[Product profile]
 Product name
 Product category
 Product description
 Product rating

So, I'm expecting a result where all cosmetic products created by
pharmaceutical companies are returned.

The problem is, I've read in posts a year old that this parent-child
relationship can only be solved by indexing the denormalized data together.
However, I'm dealing with 10,000,000 companies with possibly 10 products
each, so my data requirements are going to be HUGGEE!!

Is there a new feature in Solr which can handle this for me without the need
for de-normalization?
-- 
View this message in context: 
http://old.nabble.com/Best-wasy-to-solve-Parent-Child-relationship-without-Denormalizing--tp27225593p27225593.html



Re: Best wasy to solve Parent-Child relationship without Denormalizing?

2010-01-19 Thread Renaud Delbru

Hi,

SIREn [1] could help you solve this task (look at the different
indexing examples). Currently, though, only a Lucene extension is available.
If you want to use it in Solr, you will have to implement your own
Solr plugin (which should require only a limited amount of work).


[1] http://siren.sindice.com/
--
Renaud Delbru





Re: storing multiple type of records (Parent - Child Relationship)

2009-10-15 Thread ashokcz

Thanks, Avlesh, for your reply.
Yes, I had that idea too,
but the problem is that project data could change very rapidly,
so in that case I will end up changing the associated user details.
Say I have just 100 project records but 100,000 user records:
then changing one project record means changing all the associated user records,
which may run into the thousands.
So, any ideas or suggestions on how to do it?


Avlesh Singh wrote:
> > but is there a way where we can store user records separately and project
> > records separately, and just give the link in Solr, like mentioned below,
> > and still make it searchable and facetable?
>
> With a single core, unfortunately not.
>
> Denormalizing data for storage and searches is a regular practice in Solr.
> It might not sound proper if you try to do this with heavily normalized
> data, but there is nothing wrong with it.
>
> To be specific, in your case, the fields to facet and search upon are
> designed correctly. My understanding is that you need the relationships to
> be preserved only for display. Right? If yes, then you can always create an
> untokenized field, say string, and store all the project-specific data in
> some delimited format, e.g. in your case
> projectName$$projectBU$$projectLocation etc. This data can be interpreted in
> your application to convert it back into a proper relational data structure
> for each document in the result.
>
> Cheers
> Avlesh
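
A minimal sketch of the delimited-field idea quoted above: pack each project into one "$$"-separated string for an untokenized, stored field, and split it back in the application when displaying results. The helper names are hypothetical, and the sketch assumes no project value ever contains "$$".

```python
DELIMITER = "$$"
FIELDS = ("projectName", "projectBU", "projectLocation")


def encode_project(project):
    """Pack one project into a single delimited string for a stored field."""
    values = [project[name] for name in FIELDS]
    for value in values:
        if DELIMITER in value:
            raise ValueError("value contains the delimiter: %r" % value)
    return DELIMITER.join(values)


def decode_project(stored):
    """Turn a stored delimited string back into a project record."""
    values = stored.split(DELIMITER)
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of parts in %r" % stored)
    return dict(zip(FIELDS, values))


project = {"projectName": "project1", "projectBU": "retail",
           "projectLocation": "UK"}
stored = encode_project(project)
restored = decode_project(stored)
```

The packed field is for display only; searching and faceting would still use the separate ProjectBU and ProjectLocation fields.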
 

-- 
View this message in context: 
http://www.nabble.com/storing-multiple-type-of-records-%28Parent---Child-Relationship%29-tp25902894p25903679.html



storing multiple type of records (Parent - Child Relationship)

2009-10-14 Thread ashokcz

Hi All,
I have a specific requirement of storing multiple types of records, but don't
know how to do it.
First let me describe the requirement.
I have a table called the user table, and a user can be mapped to multiple
projects.
User table details are User Name, User Id, address, and other details.
I have stored them in Solr, but now the mapping between user and project has
to be stored.
The project table has project name, location, business unit, etc.

I can still go ahead and store a user as a single record with project details
as individual fields, like
UserId:user1
UserAddress: india
ProjectNames: project1,project2
ProjectBU: retail , finance
ProjectLocation:UK,US

Here I will search in fields like UserId, ProjectBU, ProjectLocation and
have made UserAddress, ProjectLocation facets.


But is there a way where we can store user records separately and project
records separately,
and just give the link in Solr, like mentioned below, and still make it
searchable and facetable?

User Details
============
UserId:user1 
UserAddress: india
ProjectId:1,2

Project Details
===============
ProjectId:1
ProjectNames: project1
ProjectBU: retail
ProjectLocation:UK

ProjectId:2
ProjectNames: project2
ProjectBU:finance
ProjectLocation:US


-- 
View this message in context: 
http://www.nabble.com/storing-multiple-type-of-records-%28Parent---Child-Relationship%29-tp25902894p25902894.html



Re: How to query a parent child relationship returning result set of parents?

2006-12-12 Thread Chris Hostetter
: Given this type of layout, how would I go about querying and returning
: a list of blogs which contain text in either the blog content or any
: of the comments' content?

A big issue is how timely your comments have to show up in the index
... for some people an acceptable tradeoff is that new/edited blogs get
sent to the index immediately, but a cron running at fixed regular
intervals indexes the comments ... in that approach your first idea is
usually the most straightforward...

: 1) aggregate comment content into the blog content index, allowing me
: to query directly on the blog.  However we are expecting the site to


A hybrid of your second and third suggestions is a much more involved
approach that might also work...

: 2) Use facets to get a list of parent items and issue an additional
: query (or hit the database) to pull in the parent content.  Again,
...
: 3) Plug into the solr code and implement a custom request handler,
: HitCollector, or ...?  I've spent some time digging into the solr code
: and I don't see any obvious place to plug this type of functionality

You can do pretty much anything you want in a custom request handler, but
i must admit that off the top of my head i can't think of any elegant way
to solve your problem.

Most people i know are happy with option #1 :)

-Hoss



Re: How to query a parent child relationship returning result set of parents?

2006-12-12 Thread Eric Van Dewoestine

> You can do pretty much anything you want in a custom request handler, but
> i must admit that off the top of my head i can't think of any elegant way
> to solve your problem.
>
> Most people i know are happy with option #1 :)
>
> -Hoss


I appreciate the input, Hoss.  Unfortunately, I don't see option 1
working for us given the number of comments we expect our site to
generate.  If Solr had some sort of append command to index only the
appended content, then this might be a more viable solution.  However,
I'm afraid of the performance impact that will result from re-indexing
the parent content and all child content every time a new child is
added.
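
For reference, later Solr releases added atomic updates, which come close to the append command described here. The sketch below builds such an update body; it assumes one of those releases and a hypothetical multi-valued comment_content field on the blog document.

```python
import json


def append_comment_update(blog_id, comment_text, field="comment_content"):
    """Build a JSON update body that adds one value to a multi-valued field."""
    return json.dumps([{"id": blog_id, field: {"add": comment_text}}])


payload = append_comment_update("blog_1", "a new comment")
```

This saves the client from resending the whole blog, but Solr still rebuilds the full document internally from its stored fields, so the indexing cost per new comment does not go away.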

It looks as though option 3 is the 'proper' solution and it's just a
matter of determining what and how to plug it in.  I've seen a couple
topics on the lucene mailing list which seem promising, so now I just
have to figure out how to fit that into the solr environment.

If you or anyone else have any more suggestions, tips, etc. I'd
appreciate the help as I'm a bit time constrained.

Thanks again for the response.

--
eric


How to query a parent child relationship returning result set of parents?

2006-12-11 Thread Eric Van Dewoestine

We are currently using Solr to index various types of content in our
system, several of which users can comment on.  What we would
like to do is issue a query on the top level content which also
searches the attached comments but returns only unique top level
documents as results, while still maintaining the option to search and
return comments as an alternative type of search for the user.

The simplest example would probably be that of a blog.  The blog could
be indexed as follows:

id: blog_intId
title: blog title
content: blog content

And the associated comments:

id: comment_intId
title: comment title
content: comment content
parentId: blog_intId

Given this type of layout, how would I go about querying and returning
a list of blogs which contain text in either the blog content or any
of the comments' content?
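
To make the desired result concrete, here is a minimal sketch in plain Python, outside of Solr, of rolling a mixed list of blog and comment hits up to unique parents. The function name is hypothetical; the id and parentId fields follow the layout above.

```python
def parent_ids(hits):
    """Map each hit to its top level blog id, in first-seen order, without duplicates."""
    seen = set()
    ordered = []
    for doc in hits:
        # A comment points at its blog through parentId; a blog is its own parent.
        pid = doc.get("parentId", doc["id"])
        if pid not in seen:
            seen.add(pid)
            ordered.append(pid)
    return ordered


hits = [
    {"id": "comment_7", "parentId": "blog_2"},
    {"id": "blog_1"},
    {"id": "comment_9", "parentId": "blog_2"},
    {"id": "comment_3", "parentId": "blog_1"},
]
result = parent_ids(hits)
```

Doing this on the client means fetching every matching hit first, which is exactly the cost I would like to avoid.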

The only solutions I can come up with would be to:
1) aggregate comment content into the blog content index, allowing me
to query directly on the blog.  However we are expecting the site to
generate many comments, along the lines of hundreds and possibly
thousands.  This also has the downside of requiring duplicate content
in the index if we want to still permit users to search on and return
comments.

2) Use facets to get a list of parent items and issue an additional
query (or hit the database) to pull in the parent content.  Again,
this isn't an ideal solution since we would have to page the results
ourselves since solr's facet parameters don't support an offset.  This
possibly negates any optimizations solr may have for paging regular
queries.  Also, it forces us to issue a second round trip to either
solr or the database to get summary content to display in the search
results list.  It also seems like a poor use case for the facet
functionality in general.

3) Plug into the solr code and implement a custom request handler,
HitCollector, or ...?  I've spent some time digging into the solr code
and I don't see any obvious place to plug this type of functionality
in.  A major concern of mine is performance as well, so I want to
ensure that I can get at and modify the results prior to solr loading
any unnecessary content into memory.
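
As an illustration of the client-side paging that option 2 would require, here is a small sketch. It assumes the facet counts for parentId come back as a flat list alternating value and count, as in Solr's default JSON response layout; the helper names are hypothetical.

```python
def facet_pairs(flat):
    """Turn a flat [value, count, value, count, ...] list into (value, count) tuples."""
    return list(zip(flat[0::2], flat[1::2]))


def page(items, offset, limit):
    """Return one page of items; this is the paging we would have to do ourselves."""
    return items[offset:offset + limit]


# Example facet result for the parentId field of the matching comments.
facet_field = ["blog_2", 5, "blog_1", 3, "blog_9", 1]
second_page = page(facet_pairs(facet_field), offset=1, limit=2)
```

The ids on the page would then go into the second request that fetches the blog summaries.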

Any thoughts on this are very appreciated.  Any kind of kick start,
pointer, or places to dig into would be very helpful.

--
eric