Hi Mike,
The first thing that comes to mind is to run a query for each document
type (assuming that you have a field that stores the type) and qualify
the document type: for example type:pdf. Then you would have to write
something to combine the query results drawing an equal number of hits
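
One way to combine the per-type results might be a simple round-robin merge. The following is only a sketch: the class name TypeBalancedMerge and the method interleave are made up for illustration, and it assumes the hits for each type:xxx query have already been collected into one list per type, best hit first. It does not touch Lucene at all.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Lucene or Solr.
final class TypeBalancedMerge {

    /**
     * Interleaves the per-type hit lists round-robin so that each document
     * type contributes one hit per round until `limit` hits are collected
     * or every list is exhausted.
     */
    static <T> List<T> interleave(List<List<T>> perTypeHits, int limit) {
        List<Iterator<T>> iterators = new ArrayList<>();
        for (List<T> hits : perTypeHits) {
            iterators.add(hits.iterator());
        }

        List<T> merged = new ArrayList<>();
        boolean tookAny = true;
        // Keep making rounds while there is room and the last round added something.
        while (merged.size() < limit && tookAny) {
            tookAny = false;
            for (Iterator<T> it : iterators) {
                if (merged.size() >= limit) {
                    break;
                }
                if (it.hasNext()) {
                    merged.add(it.next());
                    tookAny = true;
                }
            }
        }
        return merged;
    }
}
```

A type with fewer hits than the others simply drops out of later rounds, so the counts are equal only as far as each type has hits to give.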
Hi All,
I'll explain what I'm working on, and then I'll ask my two questions.
I'm working on the issue
https://issues.apache.org/jira/browse/SOLR-380 which is a feature
request that allows one to index a "Structured Document" which is
anything that can be represented by XML in order to pr
Hi Grant,
Thanks for your response!
Taking a closer look, the TokenFilter(s) that cause my problem
with the Payload are all from org.apache.solr.analysis rather than
org.apache.lucene.analysis. I had originally thought that all the
TokenFilters available through Solr's TokenFilterFa
I apologize for cross-posting but I believe both Solr and Lucene users
and developers should be concerned with this. I am not aware of a
better way to reach both communities.
In this email I'm looking for comments on:
* Do TokenFilters belong in the Solr code base at all?
* How to deal
Hi Maurizio,
I'm replying in java-user because I believe this is the appropriate
place for a question like this.
All the patches that I have encountered (including this one) are
usually applied at the root. One should download the source code from
http://svn.apache.org/repos/asf/lucen
Hi Martin,
Take a look at what I've done with SOLR-380
(https://issues.apache.org/jira/browse/SOLR-380). It might solve your
problem, or at least give you a good starting point.
Tricia
Michael McCandless wrote:
I think you could use payloads (= arbitrary/opaque byte[]) for this?
You ca
Martin Owens wrote:
Dear Lucene Users and Tricia Williams,
The way we're operating our lucene index is one where we index all the
terms but not store the text. From your SOLR-380 patch example Tricia I
was able to get a very good idea of how to set things up. Historically I
have used
Hi,
Take a look at what I've done with SOLR-380
(https://issues.apache.org/jira/browse/SOLR-380). The part you might
find particularly useful is the Tokenizer.
Tricia
[EMAIL PROTECTED] wrote:
Dear users,
Question on approaches to indexing TEI XML or similar section/subsectioned
files.
When you create a Document by adding Field(s)
(http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Field.html)
consider the last constructor which allows you to specify if the field
will have its TermVector stored or not stored. Also, Luke has a column in
its document view wh
Hi,
I'm wondering why Stored isn't one of the IndexReader.FieldOption(s)?
Stored is created at the same time and place as the other options
(FieldOption.INDEXED and FieldOption.TERMVECTOR) so it doesn't make sense
that it isn't retrieved in the same way.
Tricia
---
Hi All,
I'm using an html form to send a query to an xsp which uses lucene to
search and then returns the results as xml. Perhaps someone has
experienced the problem that I'm currently experiencing. When the query
is parsed org.apache.lucene.queryParser.ParseException is thrown stating
that
Hi James,
A paper was mentioned on this list in the last couple of months which
presents a solution to your sampling problem without having to know the
total results size in advance. The paper
(http://www2005.org/cdrom/docs/p245.pdf) presents two solutions which
utilize a random variable.
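
For readers who want something concrete, the standard technique for sampling from a stream of unknown length is reservoir sampling. The sketch below is that textbook algorithm, not necessarily either of the two solutions in the paper. The class name ReservoirSampler and its method are made up for illustration, and the hits are assumed to arrive through a plain Iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper illustrating textbook reservoir sampling.
final class ReservoirSampler {

    /**
     * Returns up to k items chosen uniformly at random from `hits`
     * in a single pass, without knowing how many items there are.
     */
    static <T> List<T> sample(Iterator<T> hits, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must not be negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        int seen = 0;
        while (hits.hasNext()) {
            T hit = hits.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k items.
                reservoir.add(hit);
            } else {
                // Keep the new item with probability k / seen by
                // overwriting a uniformly chosen slot.
                int j = rng.nextInt(seen);
                if (j < k) {
                    reservoir.set(j, hit);
                }
            }
        }
        return reservoir;
    }
}
```

If the stream holds fewer than k items, all of them are returned in their original order.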
Hi,
I'd like to store a HashMap for some extra data to be used when a given
document is retrieved as a Hit for a query. To add an UnIndexed Field
to an index takes only Strings as parameters. Does anyone have any
suggestions on how I might convert the HashMap to a String that is
efficiently r
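
One possible approach, sketched here with the JDK only, is to serialize the map and Base64-encode the bytes so the result can be stored as a String field value. The class MapFieldCodec and its methods are made up for illustration, and the sketch assumes a HashMap with String keys and String values, which the question does not state. java.util.Base64 needs Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Base64;
import java.util.HashMap;

// Hypothetical helper; assumes String keys and String values.
final class MapFieldCodec {

    /** Serializes the map and returns it as a Base64 string. */
    static String encode(HashMap<String, String> map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(map);
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize map", e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Rebuilds the map from a string produced by encode(). */
    @SuppressWarnings("unchecked")
    static HashMap<String, String> decode(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (HashMap<String, String>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize map", e);
        }
    }
}
```

The encoded string would go into the stored, unindexed field, and decode() would be called on the value read back from a hit. Java serialization is convenient rather than compact, so a hand-rolled key/value encoding may be a better fit for small maps.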
o a HashMap)
>
> -----Original Message-----
> From: Tricia Williams [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 20, 2005 3:14 PM
> To: java-user@lucene.apache.org
> Subject: Storing HashMap as an UnIndexed Field
>
> Hi,
>
> I'd like to store a HashMap for some ex
I am finding that TermDocs.freq() method is returning an incorrect value.
I was wondering if anyone else had experienced this problem.
I am using tp = IndexReader.termPositions( queryTerm ) to return an object
which implements TermPositions. I then use tp.skipTo( docid ) to go
directly to the docu
? Is there an obvious work-around so that the frequency
that I receive is correct for my document?
Thank you for your consideration,
Tricia
On Thu, 29 Sep 2005, Tricia Williams wrote:
> I am finding that TermDocs.freq() method is returning an incorrect value.
> I was wondering if anyone el
17 matches