[jira] Commented: (SOLR-1229) deletedPkQuery feature does not work when pk and uniqueKey field do not have the same value

2009-08-06 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739930#action_12739930
 ] 

Lance Norskog commented on SOLR-1229:
-


There are a couple of other features to remove in this code:

1) multiple primary keys
2) deltaImportQuery is created automatically if it is not given in the 
dataconfig.xml file

Do we want to attack all of those in this issue?





 deletedPkQuery feature does not work when pk and uniqueKey field do not have 
 the same value
 ---

 Key: SOLR-1229
 URL: https://issues.apache.org/jira/browse/SOLR-1229
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Erik Hatcher
Assignee: Erik Hatcher
 Fix For: 1.4

 Attachments: SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, 
 SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, tests.patch


 Problem doing a delta-import such that records marked as deleted in the 
 database are removed from Solr using deletedPkQuery.
 Here's a config I'm using against a mocked test database:
 {code:xml}
 <dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db"/>
  <document name="tests">
    <entity name="test"
            pk="board_id"
            transformer="TemplateTransformer"
            deletedPkQuery="select board_id from boards where deleted = 'Y'"
            query="select * from boards where deleted = 'N'"
            deltaImportQuery="select * from boards where deleted = 'N'"
            deltaQuery="select * from boards where deleted = 'N'"
            preImportDeleteQuery="datasource:board">
      <field column="id" template="board-${test.board_id}"/>
      <field column="datasource" template="board"/>
      <field column="title" />
    </entity>
  </document>
 </dataConfig>
 {code}
 Note that the uniqueKey in Solr is the id field, and its value is a 
 template, board-PK.
 I noticed that the javadoc comment in DocBuilder#collectDelta says "Note: In 
 our definition, unique key of Solr document is the primary key of the top 
 level entity."  This of course isn't really an appropriate assumption.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1229) deletedPkQuery feature does not work when pk and uniqueKey field do not have the same value

2009-08-06 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739948#action_12739948
 ] 

Noble Paul commented on SOLR-1229:
--

bq. Do we want to attack all of those in this issue?

We must remove them. Let us have a separate issue for them.

 deletedPkQuery feature does not work when pk and uniqueKey field do not have 
 the same value
 ---

 Key: SOLR-1229
 URL: https://issues.apache.org/jira/browse/SOLR-1229
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Erik Hatcher
Assignee: Erik Hatcher
 Fix For: 1.4

 Attachments: SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, 
 SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, tests.patch


 Problem doing a delta-import such that records marked as deleted in the 
 database are removed from Solr using deletedPkQuery.
 Here's a config I'm using against a mocked test database:
 {code:xml}
 <dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db"/>
  <document name="tests">
    <entity name="test"
            pk="board_id"
            transformer="TemplateTransformer"
            deletedPkQuery="select board_id from boards where deleted = 'Y'"
            query="select * from boards where deleted = 'N'"
            deltaImportQuery="select * from boards where deleted = 'N'"
            deltaQuery="select * from boards where deleted = 'N'"
            preImportDeleteQuery="datasource:board">
      <field column="id" template="board-${test.board_id}"/>
      <field column="datasource" template="board"/>
      <field column="title" />
    </entity>
  </document>
 </dataConfig>
 {code}
 Note that the uniqueKey in Solr is the id field, and its value is a 
 template, board-PK.
 I noticed that the javadoc comment in DocBuilder#collectDelta says "Note: In 
 our definition, unique key of Solr document is the primary key of the top 
 level entity."  This of course isn't really an appropriate assumption.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Upgrading Lucene

2009-08-06 Thread Grant Ingersoll


On Aug 5, 2009, at 6:07 PM, Mark Miller wrote:


Mark Miller wrote:

Grant Ingersoll wrote:


On Aug 3, 2009, at 8:21 PM, Mark Miller wrote:



4. You cannot instantiate MergePolicy with a no arg constructor  
anymore - so that fails now. I don't have a fix for this at the  
moment.



That sounds like a back compat break ;-)

It was - but they knew it would be and decided it was fine. The  
methods on the class were package private, so it appeared  
reasonable. The class was also labeled as expert and subject to  
sudden change. I guess it was fair game to break - I don't think  
this scenario was thought of, but I would think we can work around  
it. I haven't really thought about it yet myself though.



So this is a bit tricky I guess. The way they handled this in Lucene  
Benchmark is:
writer.setMergePolicy((MergePolicy)  
Class.forName(mergePolicy).getConstructor(new Class[]  
{ IndexWriter.class }).newInstance(new Object[] { writer }));


Now if we handle it the same way, that's fine. But then you can't put  
one of these in solr.home/lib. To do that, you have to use  
SolrResourceLoader.newInstance - which requires a no arg constructor.


There's a newInstance on SolrResourceLoader that can take args, if  
that helps.



I think we can do something like: create an object with  
SolrResourceLoader that loads the MergePolicy - it can use  
Class.forName like above because it will use the Classloader of the  
object that invoked it. But I'd rather not go there if we don't have  
to. This is a pretty advanced plugin, and likely just intended for  
picking between included Lucene impls. Do we want to make sure it  
can still be loaded from solr.home/lib?



We could require that any implementations take a writer.  I think we  
need a better way of taking in arbitrary attributes.  Basically,  
Spring or GUICE or whatever, but that isn't going to happen overnight.




Re: Upgrading Lucene

2009-08-06 Thread Mark Miller

Grant Ingersoll wrote:


On Aug 5, 2009, at 6:07 PM, Mark Miller wrote:


Mark Miller wrote:

Grant Ingersoll wrote:


On Aug 3, 2009, at 8:21 PM, Mark Miller wrote:



4. You cannot instantiate MergePolicy with a no arg constructor 
anymore - so that fails now. I don't have a fix for this at the 
moment.



That sounds like a back compat break ;-)

It was - but they knew it would be and decided it was fine. The 
methods on the class were package private, so it appeared 
reasonable. The class was also labeled as expert and subject to 
sudden change. I guess it was fair game to break - I don't think 
this scenario was thought of, but I would think we can work around 
it. I haven't really thought about it yet myself though.



So this is a bit tricky I guess. The way they handled this in Lucene 
Benchmark is:
writer.setMergePolicy((MergePolicy) 
Class.forName(mergePolicy).getConstructor(new Class[] { 
IndexWriter.class }).newInstance(new Object[] { writer }));


Now if we handle it the same way, that's fine. But then you can't put 
one of these in solr.home/lib. To do that, you have to use 
SolrResourceLoader.newInstance - which requires a no arg constructor.


There's a newInstance on SolrResourceLoader that can take args, if 
that helps.
Ah, nice - I think it does. I kept glossing over it and just saw:   
public Object newInstance(String cname, String ... subpackages)


There is also   public Object newInstance(String cName, String [] 
subPackages, Class[] params, Object[] args)


Thanks - it looks like it should work.
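
A rough sketch of how that overload might be wired up, assuming Lucene 2.9's MergePolicy(IndexWriter) constructor and the signature quoted above (untested; the empty subpackages array is a guess):

{code:java}
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.MergePolicy;
import org.apache.solr.core.SolrResourceLoader;

public class MergePolicyLoader {
  // Loads a MergePolicy plugin (possibly from solr.home/lib) even though it
  // has no no-arg constructor, by passing the writer through newInstance.
  public static MergePolicy load(SolrResourceLoader loader, String className,
                                 IndexWriter writer) {
    return (MergePolicy) loader.newInstance(className,
        new String[0],                      // no extra subpackages to search
        new Class[] { IndexWriter.class },  // constructor parameter types
        new Object[] { writer });           // constructor arguments
  }
}
{code}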




I think we can do something like: create an object with 
SolrResourceLoader that loads the MergePolicy - it can use 
Class.forName like above because it will use the Classloader of the 
object that invoked it. But I'd rather not go there if we don't have 
to. This is a pretty advanced plugin, and likely just intended for 
picking between included Lucene impls. Do we want to make sure it can 
still be loaded from solr.home/lib?



We could require that any implementations take a writer.  I think we 
need a better way of taking in arbitrary attributes.  Basically, 
Spring or GUICE or whatever, but that isn't going to happen overnight.


Right - and we want to upgrade now :) Looks like we have the 
functionality we need though, thanks!


--
- Mark

http://www.lucidimagination.com





[jira] Created: (SOLR-1342) CapitalizationFilterFactory uses incorrect length calculations

2009-08-06 Thread Robert Muir (JIRA)
CapitalizationFilterFactory uses incorrect length calculations
--

 Key: SOLR-1342
 URL: https://issues.apache.org/jira/browse/SOLR-1342
 Project: Solr
  Issue Type: Bug
  Components: Analysis
Reporter: Robert Muir
Priority: Minor


CapitalizationFilterFactory in some cases uses termBuffer.length, which might 
be larger than the actual term length (termLength()).

This causes keep words to be evaluated incorrectly; with LUCENE-1762 the bug 
is exposed because the default buffer size has changed.
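
For context, an illustrative sketch of the mismatch described above, using Lucene's Token API (not the actual CapitalizationFilterFactory code):

{code:java}
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.Token;

public class KeepCheckExample {
  // termBuffer() returns the backing array, whose length is the buffer
  // capacity; termLength() is the number of valid characters in it.
  static boolean isKeepWord(Token token, CharArraySet keep) {
    char[] buf = token.termBuffer();
    // Buggy pattern: buf.length can exceed the real term length once the
    // default buffer size grows (as after LUCENE-1762), so lookups miss:
    //   return keep.contains(buf, 0, buf.length);
    return keep.contains(buf, 0, token.termLength()); // correct length
  }
}
{code}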

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1342) CapitalizationFilterFactory uses incorrect length calculations

2009-08-06 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-1342:
--

Attachment: SOLR-1342.patch

Patch attached. If it's not obvious that it's a bug, I can try to create a 
test case that will show the bug with the old Lucene jar.

 CapitalizationFilterFactory uses incorrect length calculations
 --

 Key: SOLR-1342
 URL: https://issues.apache.org/jira/browse/SOLR-1342
 Project: Solr
  Issue Type: Bug
  Components: Analysis
Reporter: Robert Muir
Priority: Minor
 Attachments: SOLR-1342.patch


 CapitalizationFilterFactory in some cases uses termBuffer.length, which might 
 be larger than the actual term length (termLength()).
 This causes keep words to be evaluated incorrectly; with LUCENE-1762 the bug 
 is exposed because the default buffer size has changed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-706) Fast auto-complete suggestions

2009-08-06 Thread Ankul Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740128#action_12740128
 ] 

Ankul Garg commented on SOLR-706:
-

I got some benchmarking results comparing Lucene's prefix search and 
autocomplete via a trie using a HashMap at each node (roughly a TST, or even 
better). The TST works much better than Lucene's prefix search. How about 
using it in Solr?
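
For reference, a bare-bones sketch of the structure described above - a trie with a HashMap of children at each node (illustrative only; not the benchmarked code or a proposed Solr API):

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal prefix trie: add terms, then walk a prefix and collect completions.
public class PrefixTrie {
  private static class Node {
    final Map<Character, Node> children = new HashMap<Character, Node>();
    boolean isTerm;
  }

  private final Node root = new Node();

  public void add(String term) {
    Node n = root;
    for (int i = 0; i < term.length(); i++) {
      Character c = term.charAt(i);
      Node child = n.children.get(c);
      if (child == null) {
        child = new Node();
        n.children.put(c, child);
      }
      n = child;
    }
    n.isTerm = true;
  }

  public List<String> complete(String prefix, int max) {
    List<String> out = new ArrayList<String>();
    Node n = root;
    for (int i = 0; i < prefix.length(); i++) {
      n = n.children.get(prefix.charAt(i));
      if (n == null) return out; // no terms with this prefix
    }
    collect(n, new StringBuilder(prefix), out, max);
    return out;
  }

  private void collect(Node n, StringBuilder sb, List<String> out, int max) {
    if (out.size() >= max) return;
    if (n.isTerm) out.add(sb.toString());
    for (Map.Entry<Character, Node> e : n.children.entrySet()) {
      sb.append(e.getKey());
      collect(e.getValue(), sb, out, max);
      sb.setLength(sb.length() - 1);
      if (out.size() >= max) return;
    }
  }
}
{code}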

 Fast auto-complete suggestions
 --

 Key: SOLR-706
 URL: https://issues.apache.org/jira/browse/SOLR-706
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5


 A lot of users have suggested that facet.prefix in Solr is not the most 
 efficient way to implement an auto-complete suggestion feature. A fast 
 in-memory trie like structure has often been suggested instead. This issue 
 aims to incorporate a faster/efficient way to answer auto-complete queries in 
 Solr.
 Refer to the following discussion on solr-dev -- 
 http://markmail.org/message/sjjojrnroo3msugj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (SOLR-1342) CapitalizationFilterFactory uses incorrect length calculations

2009-08-06 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-1342:
-

Assignee: Mark Miller

 CapitalizationFilterFactory uses incorrect length calculations
 --

 Key: SOLR-1342
 URL: https://issues.apache.org/jira/browse/SOLR-1342
 Project: Solr
  Issue Type: Bug
  Components: Analysis
Reporter: Robert Muir
Assignee: Mark Miller
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1342.patch


 CapitalizationFilterFactory in some cases uses termBuffer.length, which might 
 be larger than the actual term length (termLength()).
 This causes keep words to be evaluated incorrectly; with LUCENE-1762 the bug 
 is exposed because the default buffer size has changed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1342) CapitalizationFilterFactory uses incorrect length calculations

2009-08-06 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-1342:
--

Fix Version/s: 1.4

 CapitalizationFilterFactory uses incorrect length calculations
 --

 Key: SOLR-1342
 URL: https://issues.apache.org/jira/browse/SOLR-1342
 Project: Solr
  Issue Type: Bug
  Components: Analysis
Reporter: Robert Muir
Assignee: Mark Miller
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1342.patch


 CapitalizationFilterFactory in some cases uses termBuffer.length, which might 
 be larger than the actual term length (termLength()).
 This causes keep words to be evaluated incorrectly; with LUCENE-1762 the bug 
 is exposed because the default buffer size has changed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1343) HTMLStripCharFilter

2009-08-06 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1343:
-

Attachment: SOLR-1343.patch

 HTMLStripCharFilter
 ---

 Key: SOLR-1343
 URL: https://issues.apache.org/jira/browse/SOLR-1343
 Project: Solr
  Issue Type: Improvement
  Components: Analysis
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Priority: Trivial
 Fix For: 1.4

 Attachments: SOLR-1343.patch


 Introducing HTMLStripCharFilter:
 * move HTML strip logic from HTMLStripReader to HTMLStripCharFilter
 * make HTMLStripReader deprecated
 * make HTMLStrip*TokenizerFactory deprecated

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler

2009-08-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740152#action_12740152
 ] 

Yonik Seeley commented on SOLR-1099:


Finally got around to reviewing the interface for some of this stuff...
there are a number of oddities (things like using the complete text of a field 
as the key or name in a map value, listing the value twice, requiring a 
uniqueKey)... but then I started thinking about who will use this, and maybe 
it's not worth trying to fix it up right now.

And that got me thinking why there are SolrJ classes dedicated to it... and I'm 
not sure that we should take up space for that.

IMO, common things in SolrJ should have easier, more type safe interfaces and 
uncommon, advanced features should be accessed via the generic APIs in order to 
keep the interfaces smaller and more understandable for the general user.
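
As a concrete illustration of hitting such a handler through the generic SolrJ request/NamedList API instead of dedicated classes (the handler path and analysis.* parameter names here are assumptions, not a settled interface):

{code:java}
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class FieldAnalysisExample {
  // Call the field analysis handler and get the raw response back.
  public static NamedList<Object> analyze(SolrServer server) throws Exception {
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("analysis.fieldname", "title");               // assumed names
    params.set("analysis.fieldvalue", "The Quick Brown Fox");
    params.set("analysis.query", "quick");
    params.set("analysis.showmatch", "true");
    QueryRequest req = new QueryRequest(params);
    req.setPath("/analysis/field");                          // assumed path
    return server.request(req);
  }
}
{code}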

 FieldAnalysisRequestHandler
 ---

 Key: SOLR-1099
 URL: https://issues.apache.org/jira/browse/SOLR-1099
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Shalin Shekhar Mangar
 Fix For: 1.4

 Attachments: AnalisysRequestHandler_refactored.patch, 
 analysis_request_handlers_incl_solrj.patch, 
 AnalysisRequestHandler_refactored1.patch, 
 FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch, 
 SOLR-1099.patch, SOLR-1099.patch


 The FieldAnalysisRequestHandler provides the analysis functionality of the 
 web admin page as a service. This handler accepts a fieldtype/fieldname 
 parameter and a value, and as a response returns a breakdown of the analysis 
 process. It is also possible to send a query value, which will use the 
 configured query analyzer, as well as a showmatch parameter, which will then 
 mark every matched token as a match.
 If this handler is added to the code base, I also recommend renaming the 
 current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having 
 them both inherit from one AnalysisRequestHandlerBase class which provides 
 the common functionality of the analysis breakdown and its translation to 
 named lists. This will also enhance the current AnalysisRequestHandler, which 
 right now is fairly simplistic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1340) DocumentAnalysisRequestHandler can't tolerate analysis exceptions

2009-08-06 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-1340.


Resolution: Fixed

committed fix... I can't tell if it's right or not, but I leave out entries 
that produce exceptions.
I also just commented out the tests around the id field - they didn't look 
valid.

 DocumentAnalysisRequestHandler can't tolerate analysis exceptions
 -

 Key: SOLR-1340
 URL: https://issues.apache.org/jira/browse/SOLR-1340
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Yonik Seeley
 Fix For: 1.4


 DocumentAnalysisRequestHandler throws exceptions in 
 testHandleAnalysisRequest() if analysis throws an exception

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1344) MoreLikeThis handler can't handle numeric id

2009-08-06 Thread Yonik Seeley (JIRA)
MoreLikeThis handler can't handle numeric id


 Key: SOLR-1344
 URL: https://issues.apache.org/jira/browse/SOLR-1344
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
 Fix For: 1.4


MLT fails when the uniqueKey is a numeric field

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1344) MoreLikeThis handler can't handle numeric id

2009-08-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740194#action_12740194
 ] 

Yonik Seeley commented on SOLR-1344:


committed fix.

 MoreLikeThis handler can't handle numeric id
 

 Key: SOLR-1344
 URL: https://issues.apache.org/jira/browse/SOLR-1344
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
 Fix For: 1.4


 MLT fails when the uniqueKey is a numeric field

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1319) Upgrade custom Solr Highlighter classes to new Lucene Highlighter API

2009-08-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740204#action_12740204
 ] 

Mark Miller commented on SOLR-1319:
---

We should probably be careful here in the future, and document anything that's 
based on code in Lucene without a back-compat policy as having similar looseness 
in Solr - or hide the Lucene implementation from the Solr public APIs. 

 Upgrade custom Solr Highlighter classes to new Lucene Highlighter API
 -

 Key: SOLR-1319
 URL: https://issues.apache.org/jira/browse/SOLR-1319
 Project: Solr
  Issue Type: Task
  Components: highlighter
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 1.4




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (SOLR-1337) Spans and Payloads Query Support

2009-08-06 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll reassigned SOLR-1337:
-

Assignee: Grant Ingersoll

 Spans and Payloads Query Support
 

 Key: SOLR-1337
 URL: https://issues.apache.org/jira/browse/SOLR-1337
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Fix For: 1.5


 It would be really nice to have query-side support for Spans and Payloads.  
 The main ingredients missing at this point are QueryParser support and an 
 output format for the spans and the payload spans.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (SOLR-1343) HTMLStripCharFilter

2009-08-06 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-1343:


Assignee: Koji Sekiguchi

I'll commit in a few days if there are no objections.

 HTMLStripCharFilter
 ---

 Key: SOLR-1343
 URL: https://issues.apache.org/jira/browse/SOLR-1343
 Project: Solr
  Issue Type: Improvement
  Components: Analysis
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Assignee: Koji Sekiguchi
Priority: Trivial
 Fix For: 1.4

 Attachments: SOLR-1343.patch


 Introducing HTMLStripCharFilter:
 * move HTML strip logic from HTMLStripReader to HTMLStripCharFilter
 * make HTMLStripReader deprecated
 * make HTMLStrip*TokenizerFactory deprecated

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



DIH: Delta imports don't write last index time to property file

2009-08-06 Thread Jay Hill
We're using the DIH for delta imports, and we are monitoring the
handlerName.properties file with some health check scripts to verify that
deltas are running. However, we noticed that, if nothing has changed, no
update is made to the properties file.

I've verified that this is something within the Solr code where it is
deliberately not updating the property file if there were no documents
created or deleted. Here's a comment line from DocBuilder:

  // Do not commit unnecessarily if this is a delta-import and no
documents were created or deleted
In that case the finish method doesn't get called, and that is where the
persist method is called, which writes out the date to the property file.
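
Paraphrased as code, the guard looks roughly like this (simplified, hypothetical names - not the actual DocBuilder source):

{code:java}
public class DeltaPersistSketch {
  boolean deltaImport;
  int docsCreated, docsDeleted;

  void maybeFinish() {
    // Do not commit unnecessarily if this is a delta-import and no
    // documents were created or deleted
    if (deltaImport && docsCreated == 0 && docsDeleted == 0) {
      return;          // finish() is skipped, so persist() never runs...
    }
    finish();
  }

  void finish()  { persist(); /* plus commit/optimize in the real code */ }
  void persist() { /* ...and only here is last_index_time written out */ }
}
{code}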

So it's clearly not a bug per se, since this is intended, and I can see a
point to doing it that way. Deltas will still function correctly on
subsequent runs, in that anything changed going forward will still be picked
up. However, it is also misleading to treat this as the last run of the delta
import, because you can't rely on the file to know whether the delta actually
ran.

The question is: Is this the correct approach? It seems to me that the last
index time should always be logged because it clearly marks when the delta
has been run.

I wanted to get some feedback before opening an issue in JIRA, so please
respond with any preferences on this behavior. My vote would be to change
this so the last index time is always recorded.

-Jay


[jira] Created: (SOLR-1345) Upgrade Lucene in prep for 2.9 release

2009-08-06 Thread Mark Miller (JIRA)
Upgrade Lucene in prep for 2.9 release
--

 Key: SOLR-1345
 URL: https://issues.apache.org/jira/browse/SOLR-1345
 Project: Solr
  Issue Type: Task
Reporter: Mark Miller
 Fix For: 1.4




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1345) Upgrade Lucene in prep for 2.9 release

2009-08-06 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-1345:
--

Attachment: SOLR-1345.patch

I think this is most of what is needed - there is a break in back compat with 
the Highlighter that needs to be addressed, but all tests pass (with the 
blocker issue applied)

Obviously requires new jars from Lucene trunk.

 Upgrade Lucene in prep for 2.9 release
 --

 Key: SOLR-1345
 URL: https://issues.apache.org/jira/browse/SOLR-1345
 Project: Solr
  Issue Type: Task
Reporter: Mark Miller
 Fix For: 1.4

 Attachments: SOLR-1345.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1342) CapitalizationFilterFactory uses incorrect length calculations

2009-08-06 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-1342.
---

Resolution: Fixed

Thanks a lot Robert!

 CapitalizationFilterFactory uses incorrect length calculations
 --

 Key: SOLR-1342
 URL: https://issues.apache.org/jira/browse/SOLR-1342
 Project: Solr
  Issue Type: Bug
  Components: Analysis
Reporter: Robert Muir
Assignee: Mark Miller
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1342.patch


 CapitalizationFilterFactory in some cases uses termBuffer.length, which might 
 be larger than the actual term length (termLength()).
 This causes keep words to be evaluated incorrectly; with LUCENE-1762 the bug 
 is exposed because the default buffer size has changed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1346) undefined term in dismaxhandler doc

2009-08-06 Thread solrize (JIRA)
undefined term in dismaxhandler doc
---

 Key: SOLR-1346
 URL: https://issues.apache.org/jira/browse/SOLR-1346
 Project: Solr
  Issue Type: Bug
  Components: documentation
Affects Versions: 1.3
 Environment: web site
Reporter: solrize
Priority: Minor


In http://wiki.apache.org/solr/DisMaxRequestHandler section pf (Phrase Fields)

It says: "Once the list of matching documents has been identified using the fq 
and qf params, ..."

The fq param is not described in the document.  I think it is supposed to say 
the q param (i.e. fq is an error), but it could be that fq is a parameter 
described in another document.  I'll check the other docs, but if that is the 
situation then a cross-reference should be added to the dismax doc.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1345) Upgrade Lucene in prep for 2.9 release

2009-08-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740346#action_12740346
 ] 

Yonik Seeley commented on SOLR-1345:


fire when ready...

 Upgrade Lucene in prep for 2.9 release
 --

 Key: SOLR-1345
 URL: https://issues.apache.org/jira/browse/SOLR-1345
 Project: Solr
  Issue Type: Task
Reporter: Mark Miller
 Fix For: 1.4

 Attachments: SOLR-1345.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1345) Upgrade Lucene in prep for 2.9 release

2009-08-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740381#action_12740381
 ] 

Mark Miller commented on SOLR-1345:
---

committed - r801872

 Upgrade Lucene in prep for 2.9 release
 --

 Key: SOLR-1345
 URL: https://issues.apache.org/jira/browse/SOLR-1345
 Project: Solr
  Issue Type: Task
Reporter: Mark Miller
 Fix For: 1.4

 Attachments: SOLR-1345.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1229) deletedPkQuery feature does not work when pk and uniqueKey field do not have the same value

2009-08-06 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740385#action_12740385
 ] 

Lance Norskog commented on SOLR-1229:
-

The Delta2 test handles both of the problems in this issue. Should this issue 
be closed?


 deletedPkQuery feature does not work when pk and uniqueKey field do not have 
 the same value
 ---

 Key: SOLR-1229
 URL: https://issues.apache.org/jira/browse/SOLR-1229
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Erik Hatcher
Assignee: Erik Hatcher
 Fix For: 1.4

 Attachments: SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, 
 SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, tests.patch


 Problem doing a delta-import such that records marked as deleted in the 
 database are removed from Solr using deletedPkQuery.
 Here's a config I'm using against a mocked test database:
 {code:xml}
 <dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db"/>
  <document name="tests">
    <entity name="test"
            pk="board_id"
            transformer="TemplateTransformer"
            deletedPkQuery="select board_id from boards where deleted = 'Y'"
            query="select * from boards where deleted = 'N'"
            deltaImportQuery="select * from boards where deleted = 'N'"
            deltaQuery="select * from boards where deleted = 'N'"
            preImportDeleteQuery="datasource:board">
      <field column="id" template="board-${test.board_id}"/>
      <field column="datasource" template="board"/>
      <field column="title" />
    </entity>
  </document>
 </dataConfig>
 {code}
 Note that the uniqueKey in Solr is the id field, and its value is a 
 template, board-PK.
 I noticed that the javadoc comment in DocBuilder#collectDelta says "Note: In 
 our definition, unique key of Solr document is the primary key of the top 
 level entity."  This of course isn't really an appropriate assumption.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-247) Allow facet.field=* to facet on all fields (without knowing what they are)

2009-08-06 Thread Avlesh Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740397#action_12740397
 ] 

Avlesh Singh commented on SOLR-247:
---

I haven't tested this patch yet, but my belief is that the primary objective 
should be to support dynamic fields rather than pure wildcard field names. 
Dynamic fields offer a wide range of capabilities w.r.t. key-value(s) kinds of 
data. Most of the time people use such fields because the keys are not known 
upfront.

If nothing more, this patch should at least cater to that audience.

 Allow facet.field=* to facet on all fields (without knowing what they are)
 --

 Key: SOLR-247
 URL: https://issues.apache.org/jira/browse/SOLR-247
 Project: Solr
  Issue Type: Improvement
Reporter: Ryan McKinley
Priority: Minor
 Attachments: SOLR-247-FacetAllFields.patch, SOLR-247.patch, 
 SOLR-247.patch, SOLR-247.patch


 I don't know if this is a good idea to include -- it is potentially a bad 
 idea to use it, but that can be ok.
 This came out of trying to use faceting for the LukeRequestHandler top term 
 collecting.
 http://www.nabble.com/Luke-request-handler-issue-tf3762155.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: DIH: Delta imports don't write last index time to property file

2009-08-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
I am fine with both options. I wish to hear from others too.

On Fri, Aug 7, 2009 at 4:50 AM, Jay Hill <jayallenh...@gmail.com> wrote:
 We're using the DIH for delta imports, and we are monitoring the
 handlerName.properties file with some health check scripts to verify that
 deltas are running. However, we noticed that, if nothing has changed, no
 update is made to the properties file.

 I've verified that this is something within the Solr code where it is
 deliberately not updating the property file if there were no documents
 created or deleted. Here's a comment line from DocBuilder:

      // Do not commit unnecessarily if this is a delta-import and no
 documents were created or deleted
 In that case the finish method doesn't get called, and that is where the
 persist method is called, which writes out the date to the property file.

 So it's clearly not a bug per se, since this is intended, and I can see a
 point to doing it that way. Deltas will still function correctly on
 subsequent runs, in that anything changed going forward will still be picked
 up. However, it is also misleading to treat this as the last run of the delta
 import, because you can't rely on the file to know whether the delta actually
 ran.

 The question is: Is this the correct approach? It seems to me that the last
 index time should always be logged because it clearly marks when the delta
 has been run.

 I wanted to get some feedback before opening an issue in JIRA, so please
 respond with any preferences on this behavior. My vote would be to change
 this so the last index time is always recorded.

 -Jay




-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


[jira] Updated: (SOLR-1335) load core properties from a properties file

2009-08-06 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1335:
-

  Description: 
There are a few ways of loading properties at runtime:

# passing a system property on the command line
# if you use multicore, dropping it in solr.xml

If neither applies, the only way is to keep a separate solrconfig.xml for each 
instance. #1 is error prone if the user fails to start with the correct system 
property. In our case we have four different configurations for the same 
deployment, and we have to disable replication of solrconfig.xml.

It would be nice if I could distribute four properties files so that our ops 
can drop in the right one and start Solr. It is also possible for operations 
to edit a properties file, but it is risky for them to edit solrconfig.xml 
without understanding Solr.

I propose a properties file in the instancedir named solrcore.properties. If 
present, it would be loaded and its entries added as core-specific properties.




  was:
There are a few ways of loading properties at runtime:

# passing a system property on the command line
# if you use multicore, dropping it in solr.xml

If neither applies, the only way is to keep a separate solrconfig.xml for each 
instance. #1 is error prone if the user fails to start with the correct system 
property. In our case we have four different configurations for the same 
deployment, and we have to disable replication of solrconfig.xml. The 
configurations are...

# main master
# slaves of main master
# repeater
# slaves of repeater

It would be nice if I could distribute four properties files so that our ops 
can drop in the right one and start Solr.

I propose a properties file in the instancedir named solrcore.properties. If 
present, it would be loaded and its entries added as core-specific properties.




Fix Version/s: 1.4
 Assignee: Noble Paul

 load core properties from a properties file
 ---

 Key: SOLR-1335
 URL: https://issues.apache.org/jira/browse/SOLR-1335
 Project: Solr
  Issue Type: New Feature
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 1.4


 There are a few ways of loading properties at runtime:
 # passing a system property on the command line
 # if you use multicore, dropping it in solr.xml
 If neither applies, the only way is to keep a separate solrconfig.xml for each 
 instance. #1 is error prone if the user fails to start with the correct system 
 property.
 In our case we have four different configurations for the same deployment, and 
 we have to disable replication of solrconfig.xml.
 It would be nice if I could distribute four properties files so that our ops 
 can drop in the right one and start Solr. It is also possible for operations 
 to edit a properties file, but it is risky for them to edit solrconfig.xml 
 without understanding Solr.
 I propose a properties file in the instancedir named solrcore.properties. If 
 present, it would be loaded and its entries added as core-specific properties.
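 A rough sketch of what the loading step in this proposal could look like 
 (hypothetical names and file location; the actual patch may differ):
 {code:java}
 import java.io.File;
 import java.io.FileInputStream;
 import java.io.IOException;
 import java.io.InputStream;
 import java.util.Properties;

 public class SolrCorePropertiesSketch {
   // Load <instanceDir>/solrcore.properties if present; the entries would
   // then be registered as core-specific properties.
   public static Properties loadCoreProperties(File instanceDir)
       throws IOException {
     Properties props = new Properties();
     File f = new File(instanceDir, "solrcore.properties");
     if (f.exists()) {
       InputStream in = new FileInputStream(f);
       try {
         props.load(in);
       } finally {
         in.close();
       }
     }
     return props;
   }
 }
 {code}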

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1335) load core properties from a properties file

2009-08-06 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1335:
-

Attachment: SOLR-1335.patch

 load core properties from a properties file
 ---

 Key: SOLR-1335
 URL: https://issues.apache.org/jira/browse/SOLR-1335
 Project: Solr
  Issue Type: New Feature
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1335.patch


 There are a few ways of loading properties at runtime:
 # passing a system property on the command line
 # if you use multicore, dropping it in solr.xml
 If neither applies, the only way is to keep a separate solrconfig.xml for each 
 instance. #1 is error prone if the user fails to start with the correct system 
 property.
 In our case we have four different configurations for the same deployment, and 
 we have to disable replication of solrconfig.xml.
 It would be nice if I could distribute four properties files so that our ops 
 can drop in the right one and start Solr. It is also possible for operations 
 to edit a properties file, but it is risky for them to edit solrconfig.xml 
 without understanding Solr.
 I propose a properties file in the instancedir named solrcore.properties. If 
 present, it would be loaded and its entries added as core-specific properties.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.