[ANNOUNCE] Apache PyLucene 3.1.0

2011-04-07 Thread Andi Vajda
I am pleased to announce the availability of Apache PyLucene 3.1.0. Apache PyLucene, a subproject of Apache Lucene, is a Python extension for accessing Apache Lucene Core. Its goal is to allow you to use Lucene's text indexing and searching capabilities from Python. It is API compatible with

Re: [ANNOUNCE] Apache PyLucene 3.1.0

2011-04-07 Thread darren
Congrats Andi. A truly awesome project. On Thu, 7 Apr 2011 20:02:22 -0700 (PDT), Andi Vajda va...@apache.org wrote: I am pleased to announce the availability of Apache PyLucene 3.1.0. Apache PyLucene, a subproject of Apache Lucene, is a Python extension for accessing Apache Lucene Core. Its

[jira] [Commented] (LUCENE-2959) [GSoC] Implementing State of the Art Ranking for Lucene

2011-04-07 Thread David Mark Nemeskey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016741#comment-13016741 ] David Mark Nemeskey commented on LUCENE-2959: - Thanks Robert, that would be

Indexing data with Trade Mark Symbol

2011-04-07 Thread mechravi25
Hi, Has anyone indexed the data with Trade Mark symbol??...when i tried to index, the data appears as below... I want to see the Indexed data with TM symbol Indexed Data: 79797 - Siebel Research– AI Fund, 79797 - Siebel Research– AI Fund,l Original Data: 79797 - Siebel
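
A note on the symptom above: output such as "–" is what typically appears when UTF-8 bytes are decoded as Windows-1252 somewhere between the source data and the index. Whether that is the cause in this report is an assumption; the snippet does not say which encodings were involved. A minimal Python sketch of the effect, where the helper names garble and repair are hypothetical and used only for illustration:

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then misread the resulting bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Reverse garble(); only works if no bytes were lost along the way."""
    return text.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    # U+2122 is the trade mark sign, U+2013 is the en dash.
    for ch in ("\u2122", "\u2013"):
        bad = garble(ch)
        print(repr(ch), "->", repr(bad), "->", repr(repair(bad)))
```

If the round trip through repair restores the expected character, the data was most likely read with the wrong charset before indexing, and the fix belongs at the point where the bytes are first decoded rather than in the index itself.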

[jira] [Commented] (SOLR-2438) Case Insensitive Search for Wildcard Queries

2011-04-07 Thread Peter Sturge (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016770#comment-13016770 ] Peter Sturge commented on SOLR-2438: As I mentioned above, the approach is a little bit

[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2011-04-07 Thread Liwei (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016784#comment-13016784 ] Liwei commented on LUCENE-2878: --- What should I do for it? Allow Scorer to expose

[jira] [Commented] (LUCENE-2574) Optimize copies between IndexInput and Output

2011-04-07 Thread Matthias Seidel (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016794#comment-13016794 ] Matthias Seidel commented on LUCENE-2574: - Uhm, what happened to these

[jira] [Created] (SOLR-2460) Some European characters cannot be parsed correctly for some PDFs

2011-04-07 Thread JIRA
Some European characters cannot be parsed correctly for some PDFs - Key: SOLR-2460 URL: https://issues.apache.org/jira/browse/SOLR-2460 Project: Solr Issue Type: Bug

[jira] [Commented] (LUCENE-2574) Optimize copies between IndexInput and Output

2011-04-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016804#comment-13016804 ] Robert Muir commented on LUCENE-2574: - I removed them after they caused index

[jira] [Commented] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.

2011-04-07 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016805#comment-13016805 ] Dawid Weiss commented on SOLR-2378: --- Well spotted, Robert -- indeed, three-byte

[jira] [Commented] (LUCENE-2574) Optimize copies between IndexInput and Output

2011-04-07 Thread Matthias Seidel (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016807#comment-13016807 ] Matthias Seidel commented on LUCENE-2574: - Oh, sorry to hear that. Was looking

[jira] [Commented] (LUCENE-2574) Optimize copies between IndexInput and Output

2011-04-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016808#comment-13016808 ] Robert Muir commented on LUCENE-2574: - See LUCENE-2637 for more discussion.

[jira] [Commented] (LUCENE-2574) Optimize copies between IndexInput and Output

2011-04-07 Thread Matthias Seidel (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016816#comment-13016816 ] Matthias Seidel commented on LUCENE-2574: - Ok, didn't know that. Thanks for the

[jira] [Updated] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.

2011-04-07 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-2378: -- Attachment: SOLR-2378.patch Updated patch: - fixed a bug with unicode codepoints [rmuir] - added

[jira] [Updated] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.

2011-04-07 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-2378: -- Attachment: (was: SOLR-2378.patch) FST-based Lookup (suggestions) for prefix matches.

[jira] [Commented] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.

2011-04-07 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016820#comment-13016820 ] Dawid Weiss commented on SOLR-2378: --- Ok, updated patch. The only thing I would like to

[jira] [Commented] (SOLR-2193) Re-architect Update Handler

2011-04-07 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016862#comment-13016862 ] Mark Miller commented on SOLR-2193: --- I've started experimenting with some simple Lucene

[jira] [Commented] (SOLR-2193) Re-architect Update Handler

2011-04-07 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016874#comment-13016874 ] Yonik Seeley commented on SOLR-2193: NRT finally... sweet! I wonder how this should

[jira] [Commented] (SOLR-2193) Re-architect Update Handler

2011-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016875#comment-13016875 ] Michael McCandless commented on SOLR-2193: -- Fabulous!! Elimination of the

[jira] [Commented] (LUCENE-2308) Separately specify a field's type

2011-04-07 Thread Nikola Tankovic (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13017015#comment-13017015 ] Nikola Tankovic commented on LUCENE-2308: - Thank you Michael! I'll make some

[jira] [Updated] (SOLR-2443) Solr DocValues should have objectVal(int doc)

2011-04-07 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-2443: --- Attachment: SOLR-2443.patch Here's a patch that adds objectVal(), exists(), bytesVal(), fixes some

[jira] [Commented] (SOLR-2443) Solr DocValues should have objectVal(int doc)

2011-04-07 Thread Ryan McKinley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13017075#comment-13017075 ] Ryan McKinley commented on SOLR-2443: - I like FloatDocValues -- helps get rid of the

Re: [VOTE] Release PyLucene 3.1.0

2011-04-07 Thread Andi Vajda
On Sat, 2 Apr 2011, Andi Vajda wrote: The PyLucene 3.1.0-1 release closely tracking the recent release of Lucene Java 3.1.0 is ready. A release candidate is available from: http://people.apache.org/~vajda/staging_area/ A list of changes in this release can be seen at:

[jira] [Commented] (LUCENE-2956) Support updateDocument() with DWPTs

2011-04-07 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13017084#comment-13017084 ] Jason Rutherglen commented on LUCENE-2956: -- What is the status of this one? If

[jira] [Commented] (LUCENE-2956) Support updateDocument() with DWPTs

2011-04-07 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13017094#comment-13017094 ] Simon Willnauer commented on LUCENE-2956: - Jason I am working on this. First

[jira] [Created] (SOLR-2461) Make QuerySenderListener Public

2011-04-07 Thread Giovanni Fernandez-Kincade (JIRA)
Make QuerySenderListener Public --- Key: SOLR-2461 URL: https://issues.apache.org/jira/browse/SOLR-2461 Project: Solr Issue Type: Improvement Reporter: Giovanni Fernandez-Kincade Priority:

character escapes in source? ... was: Re: Eclipse: Invalid character constant

2011-04-07 Thread Chris Hostetter
replying to dev... : in eclipse you need to set your project's character encoding to UTF-8. ... : Some language specific classes like GermanLightStemmer has invalid : character : compiler errors for code like: : switch(s[i]) { : case 'ä': : case 'à ': :

RE: character escapes in source? ... was: Re: Eclipse: Invalid character constant

2011-04-07 Thread Steven A Rowe
+1 I took an all-of-the-above approach, including the Unicode character description, for the ASCIIFoldingFilter-based stuff. E.g. from the mapping file http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/conf/mapping-FoldToASCII.txt?view=markup: # Ä [LATIN CAPITAL LETTER
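
The "all-of-the-above" style mentioned above (literal character, escape, and Unicode character description side by side) can be generated mechanically. The following is only a sketch in Python: the describe helper and the exact line layout are hypothetical and are not claimed to match the format of mapping-FoldToASCII.txt.

```python
import unicodedata


def describe(ch: str) -> str:
    """Build a comment line with the literal, its \\uXXXX escape, and its Unicode name.

    Assumes a single character in the Basic Multilingual Plane, so four hex
    digits are enough for the escape.
    """
    escape = f"\\u{ord(ch):04X}"
    name = unicodedata.name(ch)
    return f"# {ch} {escape} [{name}]"


if __name__ == "__main__":
    for ch in "\u00c4\u00e4\u00e0":
        print(describe(ch))
```

Keeping the escape and the name next to the literal means the line stays unambiguous even when an editor opens the file with the wrong encoding and the literal itself is displayed incorrectly.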

Re: character escapes in source? ... was: Re: Eclipse: Invalid character constant

2011-04-07 Thread Robert Muir
On Thu, Apr 7, 2011 at 4:27 PM, Chris Hostetter hossman_luc...@fucit.org wrote: replying to dev... : in eclipse you need to set your project's character encoding to UTF-8. ... : Some language specific classes like GermanLightStemmer has invalid : character : compiler errors for

Re: character escapes in source? ... was: Re: Eclipse: Invalid character constant

2011-04-07 Thread Robert Muir
On Thu, Apr 7, 2011 at 6:48 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : -1. These files should be readable, for maintaining, debugging and : knowing whats going on. Readability is my main concern ... i don't know (and frequently can't tell) the difference between a lot of non ascii

[jira] [Updated] (SOLR-2335) FunctionQParser can't handle fieldnames containing whitespace

2011-04-07 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-2335: --- Attachment: SOLR-2335.patch updated patch, includes tests showing the parser working -- even with external

Re: Google Summer Code 2011 participation

2011-04-07 Thread Minh Doan
Hi folks, Receiving a bunch of emails recently about GSOC, I really want to join but it seems like I'm not eligible to do so even though I used to be a PhD student, and currently on leave (I will be probably back soon). I really want to contribute to lucene to implement some of my ideas. Can I have

[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6842 - Failure

2011-04-07 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6842/ 1 tests failed. REGRESSION: org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at

[HUDSON] Lucene-trunk - Build # 1523 - Failure

2011-04-07 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1523/ 1 tests failed. REGRESSION: org.apache.lucene.index.TestNRTThreads.testNRTThreads Error Message: Some threads threw uncaught exceptions! Stack Trace: junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!