Re: Searching across multivalued fields

2009-06-18 Thread MilkDud


Michael Ludwig-4 wrote:
 
 MilkDud wrote:
 What do you expect the user to enter?
 
 * "dream theater innocence faded" - certainly wrong
 * "dream theater" "innocence faded" - much better
 
 Most likely they would just enter dream theater innocence faded with no
 quotes around anything, which is a large part of the problem.  If I index
 at the track level, then all those words have to show up in a single
 track document (across its album, artist, and track name fields), which
 is the expected behavior.  If I index at the album level, however, those
 words only need to show up anywhere across the entire album.
 
 So, while it will match dream theater - innocence faded, it will also
 match any album whose tracks, taken together, happen to contain all of
 the words dream, theater, innocence, and faded, which for short queries
 can be very common.
 
 Basically, I'm looking for a way to say: match all the words in the
 search query across the artist, album, and track name, but consider only
 one track (one value of the multivalued field) at a time, given a query
 without any quotes.  Does that make sense at all?
 
 That is why I was leaning towards a track-level index, with fields such as:
 id, artist, album, track (all single-valued)
 
 It does solve that problem, but then I have to deal with duplicate data
 in the artist and album fields (and a bunch of other fields).  Indexing
 at the album level also poses further complications: I store the location
 of a preview clip next to each track, and keeping sets of per-track data
 like that together in Solr is not really feasible.
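
 The difference between the two index layouts can be illustrated with a
 small Python sketch. It is only a model of the matching behavior, not Solr
 itself: the sample albums, the whitespace tokenizer, and all function names
 are assumptions made for illustration.

```python
# Hypothetical sample data, for illustration only.
ALBUMS = [
    {"artist": "Dream Theater", "album": "Awake",
     "tracks": ["Innocence Faded", "Caught in a Web"]},
    {"artist": "Example Band", "album": "Sampler",
     "tracks": ["Dream Away", "Theater Lights", "Innocence", "Faded Photograph"]},
]


def tokens(text):
    """Lowercase whitespace tokenization (a stand-in for a real analyzer)."""
    return set(text.lower().split())


def album_matches(album, query):
    """Album-level document: every track title lands in one bag of words."""
    bag = tokens(album["artist"]) | tokens(album["album"])
    for track in album["tracks"]:
        bag |= tokens(track)
    return tokens(query) <= bag


def track_docs(album):
    """Track-level documents: artist and album are repeated on every track."""
    return [{"artist": album["artist"], "album": album["album"], "track": t}
            for t in album["tracks"]]


def track_matches(doc, query):
    """All query words must occur within this one track document."""
    bag = tokens(doc["artist"]) | tokens(doc["album"]) | tokens(doc["track"])
    return tokens(query) <= bag


query = "dream theater innocence faded"
album_hits = [a["album"] for a in ALBUMS if album_matches(a, query)]
track_hits = [(d["artist"], d["track"])
              for a in ALBUMS for d in track_docs(a) if track_matches(d, query)]
```

 In this model the album-level layout also returns the second album, whose
 tracks only contain the query words scattered across different titles,
 while the track-level layout returns just the one intended track, at the
 cost of repeating the artist and album on every document.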
 
 

-- 
View this message in context: 
http://www.nabble.com/Searching-across-multivalued-fields-tp24056297p24099668.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Searching across multivalued fields

2009-06-17 Thread MilkDud

Michael,

That part I understand, and it is what I have now.  The problem is that since
tracks is multivalued, if I search for a track "love me", I will also get
back artists that have the words "love" and "me" in separate tracks.  A
phrase query with a small ps and a large posIncGap could work for that, but
then I lose the ability to search for artist and track name together.

-Jason


Michael Ludwig-4 wrote:
 
 MilkDud wrote:
 
 To be more specific, I'm indexing a collection of music albums that
 have multiple tracks and an album artist.  So, some searches will
 contain both the artist name and the track name.  I can't make this a
 single phrase query as it is indexed across two separate fields.
 
 Use the DisMaxRequestHandler and specify all fields you want to use in
 your query in the qf parameter.
 
 <!-- qf = query fields: list of fields with boost factor -->
 <str name="qf">artist^3 album^2 track^1</str>
 
 http://wiki.apache.org/solr/DisMaxRequestHandler
 
 Michael Ludwig
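
 As a rough illustration of the suggestion above, the same qf value can also
 be passed as a request parameter. The sketch below only builds the request
 URL. The host, port, and path are placeholders, and selecting the parser
 with defType=dismax is an assumption about the Solr version in use; some
 setups select a preconfigured handler instead.

```python
from urllib.parse import urlencode


def dismax_url(base_url, user_query, debug=False):
    """Build a select URL that searches several fields with boosts."""
    params = {
        "defType": "dismax",               # assumption: parser chosen per request
        "q": user_query,                   # raw user input, no quotes needed
        "qf": "artist^3 album^2 track^1",  # fields and boosts from the config above
    }
    if debug:
        params["debugQuery"] = "true"      # shows how the query was parsed
    return base_url + "?" + urlencode(params)


# Placeholder base URL; adjust to the actual Solr instance.
url = dismax_url("http://localhost:8983/solr/select",
                 "dream theater innocence faded", debug=True)
```

 Note that qf alone only spreads the query words over the listed fields. It
 does not stop words from matching in different values of a multivalued
 track field.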
 
 




Re: Searching across multivalued fields

2009-06-17 Thread MilkDud

Yeah, I'm not using stopwords at all.  I do have tracks specified in the pf
param along with a few other fields.  That said, with a phrase query I lose
the ability to search for an artist and track combined.  Two solutions I've
thought of are indexing at the track level only (right now I have
separate documents at the track, artist, and album level) or adding a field
that contains the artist and track name concatenated, allowing for
phrase queries containing both artist and track names.
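
The concatenated-field idea can be sketched in Python as follows. This is a
minimal model, not Solr code: the field name artist_track, the helper names,
and the whitespace analyzer are all hypothetical, and the phrase check
assumes a slop of 0.

```python
def analyze(text):
    """Lowercase whitespace tokenization (a stand-in for a real analyzer)."""
    return text.lower().split()


def make_track_doc(artist, album, track):
    """Track-level document with an extra combined field (hypothetical name)."""
    return {
        "artist": artist,
        "album": album,
        "track": track,
        "artist_track": f"{artist} {track}",
    }


def phrase_matches(field_value, phrase):
    """True if the phrase tokens occur contiguously and in order (slop 0)."""
    field = analyze(field_value)
    wanted = analyze(phrase)
    n = len(wanted)
    return n > 0 and any(field[i:i + n] == wanted
                         for i in range(len(field) - n + 1))


doc = make_track_doc("Dream Theater", "Awake", "Innocence Faded")
combined_hit = phrase_matches(doc["artist_track"], "dream theater innocence faded")
track_only_hit = phrase_matches(doc["track"], "dream theater innocence faded")
```

Here the full phrase matches the combined field but not the track field on
its own, which is the behavior the extra field is meant to provide.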


Michael Ludwig-4 wrote:
 
 MilkDud wrote:

 That part I understand and is what I have now.  It's the fact that
 since tracks is multivalued, and I search for a track "love me", I
 will also get back artists that have the words "love" and "me" in
 separate tracks.
 
 Jason,
 
 are you sure "me" isn't in a stopword list used to analyze your query?
 Append debugQuery=true to find out whether by any chance it is removed
 from your query phrase. In that case, your phrase won't survive parsing,
 and all you'll be left with is "love" :-)
 
 But I guess there are quite a lot of "love" titles :-)
 
 Now with a phrase query with a small ps and a large posIncGap that
 could work.  But then I lose the ability to search for artist and
 track name together.
 
 Another thing, are you sure you have enabled pf for track?
 
 Michael Ludwig
 
 




Re: Searching across multivalued fields

2009-06-17 Thread MilkDud

Sure.  To be clear, I am actually revamping an existing index that I've
found numerous problems with so far.  Basically, what I am trying to do is
index a collection of music for an online music store.  This contains
information at the track, album, and artist levels.  These are all different
object types in the same schema, and the index does contain a lot of redundant
information.  For example, a track will have its own listing, but will show
up again in the album listing and the artist listing for the objects that
own that track.  It is done this way because we search and display the three
levels differently.  That said, I have thought of ways of just
indexing tracks and maintaining all the relevant information, but that seems
to introduce its own issues.

Thanks,
Jason


Erick Erickson wrote:
 
 Hmm. Could you expand a bit more on the problem you're trying
 to solve? The index organization you're hinting at seems close enough
 to a set of database tables to make me wonder if you're using an
 inappropriate index structure given the problem you want to solve.
 
 Not that I know enough about your problem/solution to have a valid
 opinion, but there's at least a chance that this is an XY problem.
 
 Best
 Erick
 
 




Re: Searching across multivalued fields

2009-06-17 Thread MilkDud

Ok, so let's suppose I did index at just the album level.  Using that index,
how would I handle searches of the form "artist name track name"?  If
I do the search using a phrase query, it won't match anything because the
artist and track are not in one field (hence my idea of creating a third,
concatenated field).  If I make it a non-phrase query, it'll return albums
that have those words spread across all the tracks, which is not ideal.  I.e. if
you search for a track titled "love me", you will get back albums with the
words "love" and "me" in different tracks.  Basically, I'd like it to look at
each track individually, and if the artist plus just one track match all the
search terms, then that counts as a match.  Does that make sense?  If I
index at the track level, that should work, but then I have to store
album/artist info on each track.


Michael Ludwig-4 wrote:
 
 MilkDud wrote:
 
 Basically, what I am trying to do is index a collection of music for
 an online music store.  This contains information on the track, album,
 and artist levels.  These are all different object types in the same
 schema and it does contain a lot of redundant information.
 
 What's a document in your case? If I were you, I'd probably organize
 the data so that each album is one document, because that's what you'd
 expect (shopping experience).
 
 For example, a track will have its own listing, but will show up again
 in the album listing and the artist listing for the objects that own
 that track.
 
 Sounds a bit bizarre to me, but then I don't know much about your
 requirements.
 
 There are reasons it is done this way as we search/display across the
 three differently.
 
 Hmm.
 
 That said, I have thought of ways of just indexing tracks and
 maintaining all the relevant information, but that seems to introduce
 its own issues.
 
 An album should be a document and have the following fields (and maybe
 more, if you have more data attached to it):
 
 id - unique, an identifier
 title - album title
 interpret - the musician, possibly multi-valued
 track - every song or whatever, definitely multi-valued
 
 Michael Ludwig
 
 




Searching across multivalued fields

2009-06-16 Thread MilkDud

I'm trying to prevent a search from going across multiple values in a
multivalued field and am running into an issue.  From what I've read, the
standard way to do this is with a positionIncrementGap that is larger than
the ps value.  However, I can't make this a phrase query because there is
another field that has to be searched against.

To be more specific, I'm indexing a collection of music albums that have
multiple tracks and an album artist.  So, some searches will contain both
the artist name and the track name.  I can't make this a single phrase query
as it is indexed across two separate fields.  So a small ps with a large
posIncGap doesn't do anything.  Is there any way to get past this?
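
The interaction between the position gap and the phrase slop described above
can be modeled with a small Python sketch. It is a deliberately simplified
model, not Lucene's actual algorithm: it only handles two query terms in
order, the position arithmetic is approximate, and the function names are
made up for illustration.

```python
def index_positions(values, gap):
    """Assign token positions across the values of one multivalued field.

    Between consecutive values the position counter jumps by `gap`,
    mimicking a position increment gap (simplified).
    """
    positions = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap
        for token in value.lower().split():
            pos += 1
            positions.append((token, pos))
    return positions


def sloppy_match(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` positions between."""
    firsts = [p for t, p in positions if t == first]
    seconds = [p for t, p in positions if t == second]
    return any(0 <= s - f - 1 <= slop for f in firsts for s in seconds)


tracks = ["Love Story", "Tell Me"]  # hypothetical track titles
no_gap = sloppy_match(index_positions(tracks, gap=0), "love", "me", slop=2)
big_gap = sloppy_match(index_positions(tracks, gap=100), "love", "me", slop=2)
same_track = sloppy_match(index_positions(["Love Me"], gap=100), "love", "me", slop=2)
```

With no gap, "love" and "me" from two different tracks are close enough to
satisfy the slop. With a gap larger than the slop they are not, while a track
that really is titled "Love Me" still matches. This only helps within the one
field, which is why it does not cover a query that also includes the artist.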