Re: Lucene Newbie Question

2013-12-08 Thread Furkan KAMACI
Is numFound different for those two queries or not?

On Sunday, 8 December 2013, Ted Goldstein tedgoldst...@gmail.com wrote:
 I am new to Lucene and have begun experimenting. I've loaded both the
example books.csv and the various example electronic components documents.
 I then do a variety of queries.
 Querying http://su2c-dev.ucsc.edu:8983/solr/select?q=name:A* returns
both book entries and electronic component entries. But
http://su2c-dev.ucsc.edu:8983/solr/select?q=name:* only returns book entries.
 It is non-intuitive to me that a broader query should return only one
document type.
 Why is that?

 Thanks,
 Ted




 -
 To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-user-h...@lucene.apache.org




Re: Lucene Newbie Question

2013-12-08 Thread Michael Sokolov

On 12/8/2013 12:03 PM, Ted Goldstein wrote:

I am new to Lucene and have begun experimenting. I've loaded both the example 
books.csv and the various example electronic components documents.  I then do a 
variety of queries.
Querying http://su2c-dev.ucsc.edu:8983/solr/select?q=name:A* returns both book
entries and electronic component entries. But
http://su2c-dev.ucsc.edu:8983/solr/select?q=name:* only returns book
entries. It is non-intuitive to me that a broader query should return only
one document type.
Why is that?

Thanks,
Ted





Wild guess, since you really didn't tell us much about your setup: are
there more entries on another page in the Solr admin query tool?  I
think this may have been what Furkan was hinting at with his question
about numFound.  Also, this is probably more of a question for the
solr-user mailing list, since you seem to be using Solr to do the querying.
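
To make the paging guess concrete: by default Solr's select handler returns only the first 10 matches, the rows and start parameters control which slice comes back, and numFound in the response reports the total. Below is a rough sketch of walking through all pages, assuming the host and query from Ted's URLs. The class name, the method name and the numFound value of 25 are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Solr): builds one select URL per page.
class SolrPages {
    static List<String> pageUrls(String selectUrl, String query, int numFound, int rows) {
        List<String> urls = new ArrayList<>();
        // Step through the result set in slices of `rows` documents.
        for (int start = 0; start < numFound; start += rows) {
            urls.add(selectUrl + "?q=" + query + "&start=" + start + "&rows=" + rows);
        }
        return urls;
    }

    public static void main(String[] args) {
        // numFound = 25 is an invented value; read the real one from the response.
        String select = "http://su2c-dev.ucsc.edu:8983/solr/select";
        for (String url : pageUrls(select, "name:*", 25, 10)) {
            System.out.println(url);
        }
    }
}
```

If numFound for name:* turns out to be larger than the number of entries shown, the electronic component entries may simply sit on a later page.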


-Mike




Re: lucene newbie question

2006-10-02 Thread Erik Hatcher


On Oct 2, 2006, at 2:08 PM, Los Morales wrote:
I'm new to Lucene and IR in general.  I'm a bit confused on the  
concept of fields.  From what I've read, a field does not have to  
be indexed but its value can be stored in an index.  Likewise a  
field can be indexed but its value is not stored in an index.  Now  
how can a field be searchable when its value is not stored in the  
index and vice-versa?  Again, I'm new to the Index/Search  
paradigm.  Thanks in advance.


Consider the index in the back of a book.  You could tear that out  
and still use it to tell what page something is on, but you have no  
actual content in hand.  When a field is tokenized (and therefore  
implicitly indexed), it is run through the specified Analyzer and the  
terms emitted are indexed, but the original text may or may not also  
be stored in the index.


Make sense?

Erik





Re: lucene newbie question

2006-10-02 Thread Doron Cohen
The SSN case is actually a common situation.

Assume you have a (relational) database with a table of products with three
columns:
- SSN, which is also the primary key for that table,
- DESCRIPTION, which has free text (i.e. unformatted text) describing the
product,
- OTHER, additional info.
Also assume you want to allow users of your application to search a product
by its description. For each product found, you intend to fetch the data on
that product from the database and display it to the users.

This can be done in the following setup:
Create a Lucene index with two fields:
- ssn - stored, but not indexed
- description - tokenized (hence indexed) but not stored.

Now the application would send the user query to Lucene, using the
description field. For each document found, the application would fetch its
ssn (which is available from the Lucene index since it was stored). Using
this ssn, the application would fetch all sorts of data on that product and
display it to the user.
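
Here is a rough sketch of that setup in plain Java. It is a toy model only and deliberately does not use Lucene's API: the class ToyIndex, its methods and the sample products are all invented for illustration. The point is just that the description terms go into an inverted index (searchable, original text thrown away), while the ssn is kept as a stored value (retrievable, not searchable).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model, NOT Lucene's API: "ssn" is stored but not indexed,
// "description" is tokenized and indexed but not stored.
class ToyIndex {
    // Stored values: document number -> ssn.
    private final List<String> storedSsn = new ArrayList<>();
    // Inverted index: description term -> numbers of documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    void add(String ssn, String description) {
        int docId = storedSsn.size();
        storedSsn.add(ssn); // stored only, never added to the postings
        for (String term : description.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new LinkedHashSet<>()).add(docId);
            }
        }
        // The description text itself is not kept anywhere.
    }

    // Look up a description term, then return the stored ssn of each hit.
    List<String> searchDescription(String term) {
        Set<Integer> hits = postings.getOrDefault(
                term.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
        List<String> result = new ArrayList<>();
        for (int docId : hits) {
            result.add(storedSsn.get(docId));
        }
        return result;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("P-100", "Red ceramic coffee mug");
        index.add("P-200", "Stainless steel coffee grinder");
        System.out.println(index.searchDescription("coffee")); // both ssn values
        System.out.println(index.searchDescription("P-100"));  // empty: ssn is not indexed
    }
}
```

The application would then use each returned ssn as the key for the database lookup.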

There are other possible designs of course - you may want to have
additional data in the Lucene index, but this hopefully gives a
feel for how different fields with different settings are used in an
application.

I think you would find LIA (the Lucene in Action book) very useful.

Los Morales [EMAIL PROTECTED] wrote on 02/10/2006 11:46:45:

 Hi Erik,

 Thanks for the response.

 Consider the index in the back of a book.  You could tear that out and
 still use it to tell what page something is on, but you have no actual
 content in hand.

 So, I guess what I'm having a hard time trying to figure out is, what's
 the point of having an index when you can't search/retrieve the contents
 of a field in the index since it is not stored?  Isn't the whole point of
 having an index to be able to search and retrieve the contents
 efficiently?

 Basically I'm not sure of the point of the UnIndexed and UnStored field
 types.

 Say I use a field type unindexed for my SSN.  I know it's stored in the
 index but how am I supposed to retrieve it?
 As for the unstored, it's like the scenario I described above... I see the
 fields in the index but I won't be able to search/retrieve it since I
 don't have the contents.  The text field type makes sense to me (with data
 being a String), as well as the type keyword.

 Is there a scenario or scenarios you can describe where
 Unindexed/Unstored will be useful?  Thanks in advance!

 -los


 
 



Re: lucene newbie question

2006-10-02 Thread Erick Erickson

Another Erick (note the correct spelling <G>). See below.

On 10/2/06, Los Morales [EMAIL PROTECTED] wrote:


Hi Erik,

Thanks for the response.

Consider the index in the back of a book.  You could tear that out and
still use it to tell what page something is on, but you have no actual
content in hand.

So, I guess what I'm having a hard time trying to figure out is, what's
the point of having an index when you can't search/retrieve the contents
of a field in the index since it is not stored?  Isn't the whole point of
having an index to be able to search and retrieve the contents efficiently?



Your confusion here, I think, is that you CAN search on an unstored field.
Consider a book. I want to show the user the titles of the most-relevant
books. If I store the text of the entire book, it bloats the size of the
index markedly. So, I index the text but do NOT store it. Now I can show my
titles in relevancy order (when searched over the entire text), but don't
have to pay the penalty size-wise. What I can't do in this case is
reconstruct the book from the index because I didn't store the text. But I
can search it, which is what my app requires.


Basically I'm not sure of the point of the UnIndexed and UnStored field types.

Say I use a field type unindexed for my SSN.  I know it's stored in the
index but how am I supposed to retrieve it?



You'd search on what you *have* indexed, get the doc (from the index), and
then read the field. Something like

Hits hits = searcher.search(query);
String s = hits.doc(52).get("SSN");

I'm doing this now since we have images stored with internal IDs on a
separate file system. I *never* care to allow the user to search by our
internal ID number. So I index the caption, and STORE but do not INDEX the
internal ID. We provide a page full of links (in relevancy order) and when
the user clicks on one, use the stored internal ID to fetch the right image.


As for the unstored, it's like the scenario I described above... I see the
fields in the index but I won't be able to search/retrieve it since I
don't have the contents.  The text field type makes sense to me (with data
being a String), as well as the type keyword.

Is there a scenario or scenarios you can describe where Unindexed/Unstored
will be useful?  Thanks in advance!



Again, you can search unstored fields. You just can't reconstruct the input
with 100% fidelity (things like stop words will be missing, and any funky
games you played during indexing will mess up an attempt to reconstruct the
data).
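
To illustrate why the indexed terms alone can't give back the original text, here is a small sketch of what analysis typically does to the input. It is a toy analyzer in plain Java, not one of Lucene's: the class name, the stop-word list and the sample sentence are all made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy analyzer, not Lucene's: lowercases, splits on non-letters and
// drops a small, invented stop-word list.
class ToyAnalyzer {
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "of", "and", "is"));

    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String original = "The Art of Indexing, and the Joy of Search!";
        List<String> terms = analyze(original);
        // Only these terms would reach the index; case, punctuation and
        // stop words are gone, so the sentence can't be rebuilt from them.
        System.out.println(terms);
        System.out.println(String.join(" ", terms).equals(original));
    }
}
```

Searching for "indexing" or "search" would still find the document, which is all an unstored field needs to support.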

Hope this helps.
Erick


-los




