[jira] Created: (SOLR-275) PHP Serialized Response Writer

2007-06-26 Thread Nick Jenkin (JIRA)
PHP Serialized Response Writer
--

 Key: SOLR-275
 URL: https://issues.apache.org/jira/browse/SOLR-275
 Project: Solr
  Issue Type: New Feature
  Components: clients - php
Affects Versions: 1.2
Reporter: Nick Jenkin
Priority: Minor


A PHP response writer that returns a serialized array that can be used with the 
PHP function unserialize ( http://php.net/unserialize )

Built off the JSON Writer

I was not sure if this should be merged with 
https://issues.apache.org/jira/browse/SOLR-196

I have tried to keep code duplication very minimal, but always room for 
improvement!

Place PHPResponseWriter.java in src/org/apache/solr/request
Add the below to your solrconfig.xml:
<queryResponseWriter name="php" class="org.apache.solr.request.PHPResponseWriter"/>

Description of PHP serialization format: 
http://www.hurring.com/scott/code/perl/serialize/


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-275) PHP Serialized Response Writer

2007-06-26 Thread Nick Jenkin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Jenkin updated SOLR-275:
-

Attachment: PHPResponseWriter.java




Solr nightly build failure

2007-06-26 Thread solr-dev

init-forrest-entities:
[mkdir] Created dir: /tmp/apache-solr-nightly/build

checkJunitPresence:

compile-common:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/common
[javac] Compiling 24 source files to /tmp/apache-solr-nightly/build/common
[javac] Note: 
/tmp/apache-solr-nightly/src/java/org/apache/solr/common/params/DisMaxParams.java
 uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

compile:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/core
[javac] Compiling 193 source files to /tmp/apache-solr-nightly/build/core
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

compile-solrj-core:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/client/solrj
[javac] Compiling 21 source files to 
/tmp/apache-solr-nightly/build/client/solrj
[javac] Note: 
/tmp/apache-solr-nightly/client/java/solrj/src/org/apache/solr/client/solrj/impl/CommonsHttpSolrServer.java
 uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.

compile-solrj:
[javac] Compiling 2 source files to 
/tmp/apache-solr-nightly/build/client/solrj
[javac] Note: 
/tmp/apache-solr-nightly/client/java/solrj/src/org/apache/solr/client/solrj/embedded/JettySolrRunner.java
 uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.

compileTests:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/tests
[javac] Compiling 57 source files to /tmp/apache-solr-nightly/build/tests
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

junit:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/test-results
[junit] Running org.apache.solr.BasicFunctionalityTest
[junit] Tests run: 24, Failures: 0, Errors: 0, Time elapsed: 17.496 sec
[junit] Running org.apache.solr.ConvertedLegacyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.223 sec
[junit] Running org.apache.solr.DisMaxRequestHandlerTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 4.319 sec
[junit] Running org.apache.solr.EchoParamsTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.225 sec
[junit] Running org.apache.solr.OutputWriterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.458 sec
[junit] Running org.apache.solr.SampleTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.28 sec
[junit] Running org.apache.solr.analysis.TestBufferedTokenStream
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.053 sec
[junit] Running org.apache.solr.analysis.TestHyphenatedWordsFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.043 sec
[junit] Running org.apache.solr.analysis.TestKeepWordFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.047 sec
[junit] Running org.apache.solr.analysis.TestPatternReplaceFilter
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.054 sec
[junit] Running org.apache.solr.analysis.TestPatternTokenizerFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.064 sec
[junit] Running org.apache.solr.analysis.TestPhoneticFilter
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.063 sec
[junit] Running org.apache.solr.analysis.TestRemoveDuplicatesTokenFilter
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.047 sec
[junit] Running org.apache.solr.analysis.TestSynonymFilter
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.071 sec
[junit] Running org.apache.solr.analysis.TestTrimFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.047 sec
[junit] Running org.apache.solr.analysis.TestWordDelimiterFilter
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 3.449 sec
[junit] Running org.apache.solr.common.SolrDocumentTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.049 sec
[junit] Running org.apache.solr.common.params.SolrParamTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.057 sec
[junit] Running org.apache.solr.common.util.ContentStreamTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.143 sec
[junit] Running org.apache.solr.common.util.IteratorChainTest
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.044 sec

[jira] Commented: (SOLR-118) Some admin pages stop working with error 404 as the only symptom

2007-06-26 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508159
 ] 

Otis Gospodnetic commented on SOLR-118:
---

Scanning the past comments

My first guess would be that Jetty was left configured as is by default, which 
means it expanded webapps in /tmp/Jetty__some_dir_here.  Some UNIX boxes are 
configured to purge old files from /tmp/ in order not to run out of free disk 
space in the /tmp partition.  Perhaps this is what happened here.

The fix is to configure Jetty to expand webapps to a dir that does not get 
purged.  Of course, I cannot remember the property name for that.


 Some admin pages stop working with error 404 as the only symptom
 --

 Key: SOLR-118
 URL: https://issues.apache.org/jira/browse/SOLR-118
 Project: Solr
  Issue Type: Bug
  Components: web gui
 Environment: Fedora Core 4 (Linux version 2.6.11-1.1369_FC4smp)  
 Sun's JVM 1.5.0_07-b03
Reporter: Bertrand Delacretaz
Priority: Minor

 This was reported to the mailing list a while ago, see 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200610.mbox/[EMAIL 
 PROTECTED]
 Today I'm seeing the same thing on a Solr instance that has been running 
 since January 9th (about 13 days) with the plain start.jar setup. Index 
 contains 150'000 docs, 88322 search requests to date.
 $ curl http://localhost:8983/solr/admin/analysis.jsp
 <html>
 <head>
 <title>Error 404 /admin/analysis.jsp</title>
 </head>
 <body>
 <h2>HTTP ERROR: 404</h2><pre>/admin/analysis.jsp</pre>
 <p>RequestURI=/solr/admin/analysis.jsp</p>
 ...
 curl http://localhost:8983/solr/admin/index.jsp
 <html>
 <head>
 <title>Error 404 /admin/index.jsp</title>
 </head>
 <body>
 <h2>HTTP ERROR: 404</h2><pre>/admin/index.jsp</pre>
 <p>RequestURI=/solr/admin/index.jsp</p>
 ...
 Other admin pages work correctly, for example 
 http://localhost:8983/solr/admin/stats.jsp
 I don't see any messages in the logs, which are capturing stdout and stderr 
 from the JVM.
 I guess I'll have to restart this instance, I'm out of possibilities to find 
 out what's happening exactly.




[jira] Commented: (SOLR-267) log handler + query + hits

2007-06-26 Thread Will Johnson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508163
 ] 

Will Johnson commented on SOLR-267:
---

A few responses rolled up:

Yonik Seeley commented on SOLR-267:
---


After having used this for a ~week now I kind of do too.  I can work on
a patch that switches that log component back unless someone else (who
wants it more) beats me to it.

hits.

Agreed, I'd love to have query pipelines and indexing pipelines for
processing logic but that's a much bigger effort.  At the moment it's
only 1 line extra in each of the 'real' query handlers which doesn't
seem too bad.


Ian Holsman commented on SOLR-267:
--

long? you might need/want to put in some quotes around the query.

It will look very long :)  As long as there are no spaces which the url
encoding should handle I think things are ok (this assumes we're going
to switch back to cgi params)

it in)

Not that I know how to do.  Since the dispatch filter is a filter not a
servlet it doesn't have access to an HttpServletResponse, only a
ServletResponse which means it can't set HttpHeaders.  This was my
original idea for how to solve this problem and seems a bit more
'standard' anyways but I hit a dead end without getting more hackish
than usual.

- will

 


 log handler + query + hits
 --

 Key: SOLR-267
 URL: https://issues.apache.org/jira/browse/SOLR-267
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.3
Reporter: Will Johnson
Priority: Minor
 Fix For: 1.3

 Attachments: LogQueryHitCounts.patch, LogQueryHitCounts.patch, 
 LogQueryHitCounts.patch, LogQueryHitCounts.patch, LogQueryHitCounts.patch


 adds a logger to log handler, query string and hit counts for each query




[jira] Updated: (SOLR-275) PHP Serialized Response Writer

2007-06-26 Thread Tristan Vittorio (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tristan Vittorio updated SOLR-275:
--

Attachment: PHPResponseWriter.java

Updated version of the original PHPResponseWriter.java patched to compile in 
the current svn trunk and fix a bug that caused corrupted serialized data when 
score was not included in the return fields list.




[jira] Issue Comment Edited: (SOLR-275) PHP Serialized Response Writer

2007-06-26 Thread Tristan Vittorio (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508178
 ] 

Tristan Vittorio edited comment on SOLR-275 at 6/26/07 6:59 AM:


Hi Nick,

Thanks for submitting your PHPResponseWriter code. It seems to work pretty 
well; however, I needed to make a couple of minor changes to get it to compile 
with the current svn trunk:

27,28c27,28
< import org.apache.solr.util.NamedList;
< import org.apache.solr.util.SimpleOrderedMap;
---
> import org.apache.solr.common.util.NamedList;
> import org.apache.solr.common.util.SimpleOrderedMap;

The updated code I submitted also fixes a bug that caused the serialized data 
to be corrupt when score was not included in the return fields list:

162c162
< writer.write("a:4:{");
---
> writer.write("a:"+(includeScore ? 4 : 3)+":{");

since if score was not included, the response array contained only three 
values rather than four.
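
As a rough illustration of why the count matters, the sketch below derives the array header from the entries actually written instead of hard-coding it. It is not the attached PHPResponseWriter.java; the class name, the method name and the four key names (numFound, start, maxScore, docs) are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; names and keys are assumptions, not the attached writer.
class DocListHeaderSketch {

    // Builds a PHP-serialized array for a (hypothetical) document list summary.
    // The "a:N:{" header is computed from the number of entries really emitted,
    // so it stays correct whether or not the score entry is present.
    static String docListStart(long numFound, long start, boolean includeScore) {
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("numFound", "i:" + numFound + ";");
        entries.put("start", "i:" + start + ";");
        if (includeScore) {
            entries.put("maxScore", "d:1.0;"); // placeholder score value
        }
        entries.put("docs", "a:0:{}"); // empty nested array, no trailing semicolon

        StringBuilder sb = new StringBuilder();
        sb.append("a:").append(entries.size()).append(":{");
        for (Map.Entry<String, String> e : entries.entrySet()) {
            String key = e.getKey(); // ASCII keys, so char count equals byte count
            sb.append("s:").append(key.length()).append(":\"").append(key).append("\";");
            sb.append(e.getValue());
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        System.out.println(docListStart(0, 0, true));
        System.out.println(docListStart(0, 0, false));
    }
}
```

With a header that claims four entries but a body that holds three, unserialize() fails on the whole payload, which matches the corruption described above.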

Hopefully we can get a few more people testing this code thoroughly to make 
sure it works in all cases, since the PHP unserialize() function is very 
unforgiving on badly formatted data!

cheers,
Tristan


 was:
Updated version of the original PHPResponseWriter.java patched to compile in 
the current svn trunk and fix a bug that caused corrupted serialized data when 
score was not included in the return fields list.




[jira] Commented: (SOLR-118) Some admin pages stop working with error 404 as the only symptom

2007-06-26 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508253
 ] 

Bertrand Delacretaz commented on SOLR-118:
--

Otis, I guess we owe you a free beverage of your choice; your guess sounds 
totally right.

According to http://docs.codehaus.org/display/JETTY/Temporary+Directories the 
easiest fix might be to create a $(jetty.home)/work directory, which Jetty will 
use. I haven't checked if this works with the Jetty that's embedded with Solr.




[jira] Created: (SOLR-276) XML vs JSON writer performance issues

2007-06-26 Thread Yonik Seeley (JIRA)
XML vs JSON writer performance issues
-

 Key: SOLR-276
 URL: https://issues.apache.org/jira/browse/SOLR-276
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
Assignee: Yonik Seeley
Priority: Minor


JSON writer seems slower than the XML writer
http://www.nabble.com/XML-vs-JSON-writer-performance-issues-tf3983443.html#a11309234




[jira] Updated: (SOLR-276) XML vs JSON writer performance issues

2007-06-26 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-276:
--

Attachment: json_writer.patch

patch to use a StringBuilder instead of adding directly to the writer.
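
The general idea can be sketched as follows. This is not the contents of json_writer.patch; it is a minimal illustration, with hypothetical class and method names, of escaping a value into a local StringBuilder and handing the result to the Writer in a single call rather than one call per character.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Illustrative sketch only; not the attached json_writer.patch.
class BufferedJsonStringSketch {

    // Escapes the value into a StringBuilder, then writes it to 'out' once.
    static void writeJsonString(Writer out, String value) throws IOException {
        StringBuilder sb = new StringBuilder(value.length() + 2);
        sb.append('"');
        for (int i = 0; i < value.length(); i++) {
            char ch = value.charAt(i);
            switch (ch) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (ch < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) ch));
                    } else {
                        sb.append(ch);
                    }
            }
        }
        sb.append('"');
        out.write(sb.toString()); // single write instead of many small ones
    }

    // Convenience wrapper for trying the method out in memory.
    static String toJson(String value) {
        StringWriter out = new StringWriter();
        try {
            writeJsonString(out, value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson("line one\n\"quoted\""));
    }
}
```

Whether this helps in practice depends on the underlying Writer; an unbuffered or synchronized Writer is where many small writes tend to cost the most.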




[jira] Commented: (SOLR-275) PHP Serialized Response Writer

2007-06-26 Thread Nick Jenkin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508341
 ] 

Nick Jenkin commented on SOLR-275:
--

Thanks Tristan,
We do need more people testing; as you said, unserialize is very nasty when it 
comes to errors.
One thing that does need more testing is escaping and foreign characters, as I 
am not sure how PHP handles these (considering PHP is not UTF-8 aware yet).
-Nick
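
On the escaping and foreign-character question, a small sketch may help testers. PHP strings are byte strings, and the length in an s:N:"..."; entry is a byte count, so a writer that emits UTF-8 has to count encoded bytes rather than Java chars. Quotes inside the value are not escaped because the length prefix delimits the string. The class and method names below are hypothetical, and UTF-8 output is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; assumes the response is sent as UTF-8.
class PhpStringLengthSketch {

    // Serializes one string in PHP's s:<byte length>:"<value>"; form.
    static String serializeString(String s) {
        int byteLength = s.getBytes(StandardCharsets.UTF_8).length;
        return "s:" + byteLength + ":\"" + s + "\";";
    }

    public static void main(String[] args) {
        // Four Java chars, but five bytes in UTF-8 (the accented e takes two).
        System.out.println(serializeString("caf\u00e9"));
        // Embedded double quotes are written as-is.
        System.out.println(serializeString("say \"hi\""));
    }
}
```

Using String.length() for the first value would produce s:4 and make unserialize() reject the data, so non-ASCII field values are a useful test case.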




[jira] Commented: (SOLR-255) RemoteSearchable for Solr(use RMI)

2007-06-26 Thread Toru Matsuzawa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508380
 ] 

Toru Matsuzawa commented on SOLR-255:
-

Hi Otis & Henri,

Otis Gospodnetic wrote:
> So with your patch one can search any *one* of those indices,
> or any *group* of those indices, 
> correct?  In the case where a *group* of indices is searched,
> do you search them in parallel and merge the results?

With my patch one can search a group of these indices.
Each index in the group is searched in sequence, 
and then the results are merged.

Henri Biestro wrote:
> I've been looking quickly at your patch and 
> kinda understand why Otis is pushing for a merge. :-)
> I don't know how this is usually done; 
> should we merge the 2 issues and merge our patches?
> I can try & see how this goes if you want.

I inspected the patch for SOLR-215. 
The overlaps between SOLR-215 and SOLR-255 are 
in the constructor of SolrIndexSearcher and SolrCore.
The two modifications should be committed one after the other.
After that, not many additional modifications are needed.

The commits should be done in stages. 
(Step1 and Step2 could be done in reverse order, or perhaps simultaneously.) 
Step1) MultiCore (SOLR-215) 
Step2) The MultiSearcher functionality, excluding the RMI and Lucene 
modifications
   (SolrMultiSearcher and SolrIndexSearchable) 
Step3) The Lucene modifications
Step4) The addition of the RMI functionality (SolrRemoteSearcher) 
   (Once MultiCore is in place, an additional modification will be needed so 
that the RMI remote object is created dynamically.)

> One thing that worries me though is the Lucene patch dependency; 
> any way to only have a Solr patch?
> I would suspect that Lucene committers are as busy as Solr's 
> so the review process might take some time.
> Although from afar, it does look like pretty harmless changes so there is 
> hope...

The RMI (SolrRemoteSearcher) causes the Lucene patch dependency.
There will be no impact on SOLR-215 by the above-mentioned procedure.

> As a side note, I was wondering if we could extend 
> your patch's functionality and get read/write capability per index
> (as in http://hellonline.com/blog/?p=55 ,
> document indexing load balancing could be performed 
> by hashing unique key % number of indexes, for instance, 
> or by some configurable class). 
> The current functionality would be retained 
> by specifying 'read-only' versus 'read-write' for each index.

I also have ideas about this but those are not concrete enough.
Anyway, that will be done through Step5 and later.
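
For reference, the "unique key % number of indexes" idea quoted above could look roughly like this. It is only a sketch of the routing rule; the class and method names are hypothetical and nothing like it exists in the attached patch.

```java
// Illustrative sketch only; not part of solr-multi20070606.zip.
class IndexPartitionSketch {

    // Picks the index that should receive a document, based on its unique key.
    // Math.floorMod keeps the result in [0, numIndexes) even when hashCode()
    // is negative, which the plain % operator would not.
    static int indexFor(String uniqueKey, int numIndexes) {
        if (numIndexes <= 0) {
            throw new IllegalArgumentException("numIndexes must be positive");
        }
        return Math.floorMod(uniqueKey.hashCode(), numIndexes);
    }

    public static void main(String[] args) {
        String[] keys = {"doc-1", "doc-2", "doc-3"};
        for (String key : keys) {
            System.out.println(key + " -> index " + indexFor(key, 3));
        }
    }
}
```

One consequence of this rule is that changing the number of indexes moves most keys to a different index, so existing documents would need to be reindexed or located some other way.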

Thanks.


 RemoteSearchable for Solr(use RMI)
 --

 Key: SOLR-255
 URL: https://issues.apache.org/jira/browse/SOLR-255
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Toru Matsuzawa
 Attachments: solr-multi20070606.zip


 I experimentally implemented RemoteSearchable of Lucene for Solr.
 I referred to FederatedSearch and used RMI. 
 Two or more Searchers can be referred to with SolrIndexSearcher.
 These query-only indexes can be specified in solrconfig.xml, 
 enumerating the list under a searchIndex tag.
   <searchIndex>
     <lst>E:\sample\data1</lst>
     <lst>E:\sample\data2</lst>
     <lst>rmi://localhost</lst>
   </searchIndex>
 The index in the dataDir is also used as the default index of solr
 to update and query.
 When data of a document in an index specified under the searchIndex is
 updated, 
 that document data in the index will be deleted and data of the updated 
 document will be stored
 in the index in the dataDir.
 SolrRemoteSearchable (the searcher for remote access) is started from 
 SolrCore 
 by specifying <remoteSearcher>true</remoteSearcher> in solrconfig.xml. (It 
 is registered in RMI.)
 (-Djava.security.policy should be set when you start the VM.)
 Not all of the operational cases are tested 
 because Solr has so many features. 
 Moreover, TestUnit has not been made 
 because I made this through a trial and error process. 
 Some changes are required in Lucene to execute this. 
 I need your comments on this although it might be hard without TestUnit. 
 I especially worry about the following: 
 - Am I on the right track about this issue?
 - Is the extent of modifying Lucene tolerable?
 - Are there any ideas to implement this feature without modifying Lucene?
 - Does this idea contribute to improving Solr?
 - This implementation may partially overlap with Multiple Solr Cores.
   What should be done?
 - Are there any other considerations about this issue, which I have 
 overlooked?




[jira] Commented: (SOLR-255) RemoteSearchable for Solr(use RMI)

2007-06-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508385
 ] 

Yonik Seeley commented on SOLR-255:
---

Toru, could you give an overview of how this solution works, what data is 
passed over the network, and how well you think it scales?  What is done on the 
shards, and what is done on the combiner to sort hits by a certain field?  to 
facet by a certain field?  Some of the methods on SolrSearchable make me 
nervous about scalability (like getInts(field), getFloats(field)) but it's not 
easy to tell if/when those are used.






[jira] Commented: (SOLR-255) RemoteSearchable for Solr(use RMI)

2007-06-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508394
 ] 

Yonik Seeley commented on SOLR-255:
---

A few other considerations for good scalability:
 - shards should be searched in parallel
 - there needs to be multiple copies of each shard (esp for HA), with failover, 
otherwise splitting and distributing the index will lead to higher failure 
rates.
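
To make the first point concrete, here is a minimal sketch of searching shards in parallel and merging the top hits by score. The Hit and Shard types are hypothetical stand-ins, not classes from the patch, and the sketch does not attempt the replica and failover handling from the second point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only; Hit and Shard are hypothetical types.
class ParallelShardSearchSketch {

    static final class Hit {
        final String id;
        final float score;
        Hit(String id, float score) { this.id = id; this.score = score; }
    }

    interface Shard {
        List<Hit> search(String query, int topN);
    }

    // Sends the query to every shard at once, then merges the per-shard results
    // and keeps the topN hits with the highest scores.
    static List<Hit> searchAll(List<Shard> shards, String query, int topN) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Future<List<Hit>>> pending = new ArrayList<>();
            for (Shard shard : shards) {
                Callable<List<Hit>> task = () -> shard.search(query, topN);
                pending.add(pool.submit(task));
            }
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> future : pending) {
                merged.addAll(future.get()); // waits for that shard to finish
            }
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard search failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Shard a = (q, n) -> Arrays.asList(new Hit("a1", 0.9f), new Hit("a2", 0.2f));
        Shard b = (q, n) -> Arrays.asList(new Hit("b1", 0.5f));
        for (Hit hit : searchAll(Arrays.asList(a, b), "solr", 2)) {
            System.out.println(hit.id + " " + hit.score);
        }
    }
}
```

Merging raw scores like this assumes the scores are comparable across shards, which is not guaranteed when each shard computes its own term statistics.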





[jira] Created: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

2007-06-26 Thread Toru Matsuzawa (JIRA)
Character Entity of XHTML is not supported with XmlUpdateRequestHandler .
-

 Key: SOLR-277
 URL: https://issues.apache.org/jira/browse/SOLR-277
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 1.3
Reporter: Toru Matsuzawa


Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent

XmlUpdateRequestHandler needs to handle these entities itself because xpp3 
cannot process <!DOCTYPE> declarations.
I think this is necessary until StaxUpdateRequestHandler becomes /update.





[jira] Updated: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

2007-06-26 Thread Toru Matsuzawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toru Matsuzawa updated SOLR-277:


Attachment: XmlUpdateRequestHandler.patch

patch attached.




[jira] Commented: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

2007-06-26 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508408
 ] 

Walter Underwood commented on SOLR-277:
---

This is not a bug. Solr accepts XML, not XHTML. It does not accept XHTML-only 
entities. 

The Solr update XML format is a specific Solr XML format, not XHTML, not DocBook, 
not anything else.

To index XHTML, parse it and convert it to Solr XML update format.
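
One way to do the entity part of that conversion on the client side is to rewrite XHTML named entities as numeric character references, which any XML parser accepts without a DTD. The following is only a minimal sketch: the class name XhtmlEntityConverter and its method are hypothetical, and the entity table is a small illustrative subset of the three .ent files listed in the issue.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: rewrites XHTML named entities
// as numeric character references before the text is placed in Solr update XML.
final class XhtmlEntityConverter {

    // Illustrative subset only; the complete lists are defined in
    // xhtml-lat1.ent, xhtml-special.ent and xhtml-symbol.ent.
    private static final Map<String, Integer> ENTITIES = Map.of(
            "nbsp", 160,
            "copy", 169,
            "eacute", 233,
            "mdash", 8212,
            "alpha", 945);

    private static final Pattern NAMED = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Replaces known named entities with "&#N;". Unknown names, including the
    // five predefined XML entities (amp, lt, gt, quot, apos), are left untouched.
    public static String toNumericReferences(String text) {
        Matcher m = NAMED.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = ENTITIES.get(m.group(1));
            String replacement = (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, passing "caf&eacute; &amp; bar" through toNumericReferences yields "caf&#233; &amp; bar", which can then be embedded in a field element of an add request.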





[jira] Commented: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

2007-06-26 Thread Toru Matsuzawa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508421
 ] 

Toru Matsuzawa commented on SOLR-277:
-

Hi Walter,
I understand that this is not a bug, 
and that this patch would be short-lived.

I offered this patch because I thought Solr might support general entities, 
which would make it easier for users. 

The current behavior appears to follow the xpp3 specification. 
(In its current state, xpp3 supports only the basic Latin entities: &quot; &amp; &lt; &gt; &apos;.)

If the specification is that the Solr XML format does not support XHTML 
character entities, this issue can be closed. 

Thanks,





[jira] Updated: (SOLR-272) SolrDocument performance testing

2007-06-26 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-272:
--

Attachment: SolrInputDoc.patch

 With this test, the SolrInputDocument wins every time

Not once you correct the bugs ;-)

- copyField was not being done in the SolrInputDocument version
- setField was being used for the multiValued field instead of addField, 
resulting in fewer fields.
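
The second bug can be illustrated with a small stand-in class. This is a sketch under assumptions, not Solr's code: SimpleInputDoc and valueCount are hypothetical, and the only detail taken from the issue is that values are held in a map from field name to a collection. It shows why calling a set-style method repeatedly leaves one value where an add-style method accumulates them.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a document that stores values in a
// Map<String, Collection<Object>>; it is NOT Solr's SolrInputDocument.
final class SimpleInputDoc {
    private final Map<String, Collection<Object>> fields = new LinkedHashMap<>();

    // Replaces any existing values for the field with a single value.
    public void setField(String name, Object value) {
        Collection<Object> values = new ArrayList<>();
        values.add(value);
        fields.put(name, values);
    }

    // Appends a value, creating the collection on first use.
    public void addField(String name, Object value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Number of values currently held for the field (0 if absent).
    public int valueCount(String name) {
        Collection<Object> values = fields.get(name);
        return values == null ? 0 : values.size();
    }
}
```

With this stand-in, three setField calls on the same name leave a single value, while three addField calls leave three, so a benchmark that uses the set-style call for a multiValued field builds a smaller document than the one it is being compared against.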

I modified the schema (didn't work out of the box) and removed everything that 
didn't have to do with the fields in the document (partially because copyField 
wasn't implemented).

On my P4, SolrInputDocument comes in at 14% slower. I don't know how it 
would be with all the copyField and dynamicField stuff in there.  There are 
certainly scenarios where it could be faster, since it can do a single lookup for 
a multivalued field.



 SolrDocument performance testing
 

 Key: SOLR-272
 URL: https://issues.apache.org/jira/browse/SOLR-272
 Project: Solr
  Issue Type: Test
Affects Versions: 1.3
Reporter: Ryan McKinley
 Attachments: SOLR-272-SolrDocumentPerformanceTesting.patch, 
 SOLR-272-SolrDocumentPerformanceTesting.patch, 
 SolrDocumentPerformanceTester.java, SolrInputDoc.patch


 In 1.3, we added SolrInputDocument -- a temporary class to hold document 
 information.  There is concern that this may be less than ideal 
 performance-wise.
 To settle some concerns (mine included) I want to compare a few SolrDocument 
 implementations to make sure we are not doing something crazy.
 I implemented a LuceneInputDocument subclass of SolrInputDocument that stores 
 its values directly in Lucene Document (rather than a Map<String,Collection>).
 This is a quick test comparing:
 1. Building documents with SolrInputDocument 
 2. Building documents with LuceneInputDocument (same interface writing 
 directly to Document)
 3. using DocumentBuilder (solr 1.2, solr 1.1)




[jira] Commented: (SOLR-272) SolrDocument performance testing

2007-06-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508426
 ] 

Yonik Seeley commented on SOLR-272:
---

Note that my current fix to toDocument() for copyField isn't complete since the 
previous implementation allowed copyField from an undefined field in the schema.

It might be cleaner just to use a field that isn't indexed or stored, but that 
would be a slight backward incompatibility.
Might be OK since I don't know if anyone has ever used that feature.  Thoughts?





[jira] Commented: (SOLR-272) SolrDocument performance testing

2007-06-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508429
 ] 

Yonik Seeley commented on SOLR-272:
---

Ugh... nevermind.
 I ran svn up on a different directory than what I patched, and hence got an 
older version.
