Re: MaxFieldLength

2016-07-08 Thread Michael McCandless
The default for IndexWriter is no limit, so if that's what you'd like, then
yes, you don't need to use LimitTokenCountFilter.
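
To illustrate the point with a small model: the sketch below is plain Java, not Lucene code. The names TokenLimitSketch, tokenize and NO_LIMIT are hypothetical, and whitespace splitting stands in for a real analyzer. It only shows that a token-count limit can truncate a field and never enlarge it, which is why nothing extra is needed when the default is already unlimited.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of a token-count limit; an illustration, not the Lucene API. */
final class TokenLimitSketch {

    /** Hypothetical stand-in for "no limit". */
    static final int NO_LIMIT = Integer.MAX_VALUE;

    /** Splits on whitespace and keeps at most maxTokens tokens. */
    static List<String> tokenize(String text, int maxTokens) {
        List<String> kept = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return kept; // nothing to tokenize
        }
        for (String token : trimmed.split("\\s+")) {
            if (kept.size() >= maxTokens) {
                break; // tokens past the limit are dropped
            }
            kept.add(token);
        }
        return kept;
    }
}
```

With a limit of 3, tokenize("a b c d e", 3) keeps only the first three tokens; with NO_LIMIT all five survive.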

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jul 8, 2016 at 11:58 AM, Siraj Haider wrote:

> Thanks Mike,
>
> The name LimitTokenCountAnalyzer suggests that it is used to *Limit* the
> token count, so I was thinking that the default now is no limit and we
> might not need to use it as we wanted to increase the field size instead of
> limiting it. Please let me know.
>
>
>
> --
>
> Regards
>
> -Siraj Haider
>
> (212) 306-0154
>
>
>
> *From:* Michael McCandless [mailto:luc...@mikemccandless.com]
> *Sent:* Friday, July 08, 2016 11:56 AM
> *To:* Lucene Users; Siraj Haider
> *Subject:* Re: MaxFieldLength
>
>
>
> This was removed a while back and replaced with LimitTokenCountFilter,
> which you just need to tack onto your analysis chain to get the same
> behavior as before.
>
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
>
> On Fri, Jul 8, 2016 at 11:53 AM, Siraj Haider wrote:
>
> Hello there,
> What is the default maximum field length in Lucene 6? In Lucene 2.9 we use
> IndexWriter.MaxFieldLength to increase the default to 100,000, as we index
> some very large fields. What is the alternative to that in Lucene 6?
>
> --
> Regards
> -Siraj Haider
> (212) 306-0154
>
>
> 
>
> This electronic mail message and any attachments may contain information
> which is privileged, sensitive and/or otherwise exempt from disclosure
> under applicable law. The information is intended only for the use of the
> individual or entity named as the addressee above. If you are not the
> intended recipient, you are hereby notified that any disclosure, copying,
> distribution (electronic or otherwise) or forwarding of, or the taking of
> any action in reliance on, the contents of this transmission is strictly
> prohibited. If you have received this electronic transmission in error,
> please notify us by telephone, facsimile, or e-mail as noted above to
> arrange for the return of any electronic mail or attachments. Thank You.


Re: Port of Custom value source from v4.10.3 to v6.1.0

2016-07-08 Thread Yonik Seeley
Use getSortedDocValues for a single-valued field, or
getSortedSetDocValues for multi-valued.
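
To make the distinction concrete, here is a minimal sketch in plain Java rather than the Lucene API. DocValuesShapeSketch and its methods are hypothetical names. It assumes the usual shape of the two views: a shared sorted term dictionary, with one ordinal per document in the single-valued case (-1 when the document has no value) and a sorted set of ordinals per document in the multi-valued case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Plain-Java model of single- vs multi-valued sorted doc values; not the Lucene API. */
final class DocValuesShapeSketch {

    /** Builds the sorted, de-duplicated term dictionary shared by all documents. */
    static List<String> dictionary(List<List<String>> valuesPerDoc) {
        TreeSet<String> terms = new TreeSet<>();
        for (List<String> values : valuesPerDoc) {
            terms.addAll(values);
        }
        return new ArrayList<>(terms);
    }

    /** Single-valued view: one ordinal per document, or -1 if it has no value. */
    static int[] singleOrds(List<List<String>> valuesPerDoc) {
        List<String> dict = dictionary(valuesPerDoc);
        int[] ords = new int[valuesPerDoc.size()];
        for (int doc = 0; doc < ords.length; doc++) {
            List<String> values = valuesPerDoc.get(doc);
            if (values.size() > 1) {
                throw new IllegalArgumentException("doc " + doc + " has more than one value");
            }
            ords[doc] = values.isEmpty() ? -1 : dict.indexOf(values.get(0));
        }
        return ords;
    }

    /** Multi-valued view: a sorted, de-duplicated list of ordinals per document. */
    static List<List<Integer>> multiOrds(List<List<String>> valuesPerDoc) {
        List<String> dict = dictionary(valuesPerDoc);
        List<List<Integer>> result = new ArrayList<>();
        for (List<String> values : valuesPerDoc) {
            TreeSet<Integer> ords = new TreeSet<>();
            for (String value : values) {
                ords.add(dict.indexOf(value));
            }
            result.add(new ArrayList<>(ords));
        }
        return result;
    }
}
```

The single-valued view rejects a document with two values, which is the practical reason to pick the multi-valued accessor for multi-valued fields.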

-Yonik


On Fri, Jul 8, 2016 at 12:29 PM, paule_lecuyer wrote:
> Many thanks, Yonik, I will try that.
>
> For my own understanding, what is the difference between SortedSetDocValues
> getSortedSetDocValues(String field) and SortedDocValues
> getSortedDocValues(String field)?
>
> Paule.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Upgrade-of-Custom-value-source-code-from-v4-10-3-to-v6-1-0-tp4286236p4286387.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>




Re: Port of Custom value source from v4.10.3 to v6.1.0

2016-07-08 Thread paule_lecuyer
Many thanks, Yonik, I will try that.

For my own understanding, what is the difference between SortedSetDocValues
getSortedSetDocValues(String field) and SortedDocValues
getSortedDocValues(String field)?

Paule.






Problems Refactoring a Lucene Index

2016-07-08 Thread Stuart Goldberg
As our software goes through its lifecycle, we sometimes have to alter
existing Lucene indexes. The way I have done that in the past is to open the
existing index for reading, read each Document, modify it and write that
Document to a new index. At the end of the process, I delete the old index
and rename the new index to the old name.

I do not do any tokenizing and use no analyzers.

I recently upgraded from Lucene 3.x to 4.10.4. Now I have the following
problem: Suppose the existing document has 10 fields in it and there's one I
have to modify. I remove that field and re-add it with the new settings.
Then I add the Document in its entirety to the new index. I run into the
following problems:

*   I get exceptions thrown for fields I don't even touch. That's
because their FieldType has 'tokenized' set to true, which fails because I
am using no analyzers. 'tokenized' comes back as true even though I set it
to false when I originally added the field to the original index!

*   I have LongFields that come back with 'indexed' set to false even
though they were indexed in the original index! This makes the new index
unsearchable on these fields and hence unusable.

*   I can't even alter 'indexed' for these LongFields because for some
reason the FieldType instance comes back frozen from the IndexReader. Once
frozen, it can't be altered. Even if I create a new FieldType, there is no
way to change the FieldType of an existing Field.

It seems the returned FieldType contents are kind of random!

I did see in the Javadoc of IndexReader.document() that field metadata is
not returned and that, in fact, a new kind of object such as 'StoredField'
should be returned instead, so that there is no pretense of there being any
metadata.

I thought perhaps I could use FieldInfos, but that class returns the same
bogus metadata. What then is the purpose of FieldInfos if the info is
bogus?

Am I not understanding something here? This is not very usable. What can I
do to work around this? Is this a Lucene bug? Oversight?



RE: MaxFieldLength

2016-07-08 Thread Siraj Haider
Thanks Mike,
The name LimitTokenCountAnalyzer suggests that it is used to Limit the token 
count, so I was thinking that the default now is no limit and we might not need 
to use it as we wanted to increase the field size instead of limiting it. 
Please let me know.

--
Regards
-Siraj Haider
(212) 306-0154

From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Friday, July 08, 2016 11:56 AM
To: Lucene Users; Siraj Haider
Subject: Re: MaxFieldLength

This was removed a while back and replaced with LimitTokenCountFilter, which 
you just need to tack onto your analysis chain to get the same behavior as 
before.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jul 8, 2016 at 11:53 AM, Siraj Haider wrote:
Hello there,
What is the default maximum field length in Lucene 6? In Lucene 2.9 we use
IndexWriter.MaxFieldLength to increase the default to 100,000, as we index some
very large fields. What is the alternative to that in Lucene 6?

--
Regards
-Siraj Haider
(212) 306-0154






Re: MaxFieldLength

2016-07-08 Thread Michael McCandless
This was removed a while back and replaced with LimitTokenCountFilter,
which you just need to tack onto your analysis chain to get the same
behavior as before.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jul 8, 2016 at 11:53 AM, Siraj Haider wrote:

> Hello there,
> What is the default maximum field length in Lucene 6? In Lucene 2.9 we use
> IndexWriter.MaxFieldLength to increase the default to 100,000, as we index
> some very large fields. What is the alternative to that in Lucene 6?
>
> --
> Regards
> -Siraj Haider
> (212) 306-0154


MaxFieldLength

2016-07-08 Thread Siraj Haider
Hello there,
What is the default maximum field length in Lucene 6? In Lucene 2.9 we use
IndexWriter.MaxFieldLength to increase the default to 100,000, as we index some
very large fields. What is the alternative to that in Lucene 6?

--
Regards
-Siraj Haider
(212) 306-0154






Re: Port of Custom value source from v4.10.3 to v6.1.0

2016-07-08 Thread Yonik Seeley
Use the docValues interface by calling getSortedSetDocValues on the
leaf reader. That will either
1) use real docValues if you have indexed them, or
2) use the FieldCache to uninvert an indexed field and make it look
like docValues.
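
As a rough picture of what "uninverting" means in case 2, here is a minimal plain-Java sketch; it does not use the Lucene API, and UninvertSketch, uninvert, postings and maxDoc are hypothetical names. It assumes an inverted field is a map from each term to the documents containing it, and turns that into a per-document column of terms, which is the shape a docValues-style lookup needs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of uninverting term -> docs postings into a doc -> terms column. */
final class UninvertSketch {

    /**
     * @param postings term -> IDs of the documents containing that term
     * @param maxDoc   number of documents; IDs must lie in [0, maxDoc)
     * @return for each document ID, its terms in sorted order (empty if none)
     */
    static List<List<String>> uninvert(Map<String, int[]> postings, int maxDoc) {
        List<List<String>> column = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            column.add(new ArrayList<>());
        }
        // Visit terms in sorted order so each document's terms come out sorted.
        for (Map.Entry<String, int[]> entry : new TreeMap<String, int[]>(postings).entrySet()) {
            for (int doc : entry.getValue()) {
                column.get(doc).add(entry.getKey());
            }
        }
        return column;
    }
}
```

The model walks every posting once, which hints at why indexing real docValues avoids work that uninverting has to do when the field is first accessed.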

-Yonik


On Thu, Jul 7, 2016 at 1:33 PM, paule_lecuyer wrote:
> Hi all,
> I wrote some time ago a ValueSourceParser + ValueSource to allow using
> results produced by an external system as a facet query :
> - in solrconfig.xml : added my parser :
> http://lucene.472066.n3.nabble.com/Port-of-Custom-value-source-from-v4-10-3-to-v6-1-0-tp4286236.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.