Re: Welcome Tim Potter as Lucene/Solr committer
Welcome Tim, On Tue, Apr 8, 2014 at 3:04 PM, Michael McCandless luc...@mikemccandless.com wrote: Welcome Tim! Mike McCandless http://blog.mikemccandless.com On Tue, Apr 8, 2014 at 12:40 AM, Steve Rowe sar...@gmail.com wrote: I'm pleased to announce that Tim Potter has accepted the PMC's invitation to become a committer. Tim, it's tradition that you introduce yourself with a brief bio. Once your account has been created - could take a few days - you'll be able to add yourself to the committers section of the Who We Are page on the website: http://lucene.apache.org/whoweare.html (use the ASF CMS bookmarklet at the bottom of the page here: https://cms.apache.org/#bookmark - more info here http://www.apache.org/dev/cms.html). Check out the ASF dev page - lots of useful links: http://www.apache.org/dev/. Congratulations and welcome! Steve - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- - Noble Paul
Re: Solr Ref Guide vs. Wiki
How do we plan to redirect? If I reached a specific page in the wiki after searching in Google, will it now forward me to the home page of cwiki? Isn't it better to give links at the top of the page to relevant sections in cwiki (as much as possible)? I'm +1 for locking the old wiki down. On 7 Apr 2014 13:13, Toke Eskildsen t...@statsbiblioteket.dk wrote: On Mon, 2014-04-07 at 08:35 +0200, Shalin Shekhar Mangar wrote: On Mon, Apr 7, 2014 at 9:58 AM, Alexandre Rafalovitch arafa...@gmail.com wrote: 5. Do something about JavaDocs polluting the Google index. At minimum, create /latest/ as a stable URL path and have it very Google-visible. Make the rest of the versions in an archive, non-crawlable. There is a lot more that can be done here, but probably not as part of this cleanup (see my older post about it). I am not sure if that is a big problem for Solr. How many people look at our javadocs? How many of us actually write them? Non-existent JavaDocs is a problem in itself, but even with the current state, I expect to be able to find the current JavaDocs (i.e. for the latest stable release) through a generic search on the Internet. If the result is a page with just the auto-generated stuff and no real explanations, then I at least know that I can stop searching. - Toke Eskildsen, State and University Library, Denmark
Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)
Solr/Lucene 4.8 - Java 7: +1. I'm not too sure about moving trunk to Java 8. Let's keep it at Java 7 and make a call when we are closer to Lucene/Solr 5.0. Organizations move to newer versions of Java very slowly. On Mon, Mar 10, 2014 at 6:30 PM, Uwe Schindler u...@thetaphi.de wrote: Hi Robert, the vote must be held open for 72 hours. I haven't even had a chance to formulate my VOTE+reasoning yet, and I don't agree with this crap here. Indeed, there is no need to hurry! I just wanted more discussions coming in. The merges I prepared already are stable and pass all tests, smokers,... So no problem to wait 2 more days, it is not urgent to commit my branch_4x checkout. As said in the thread already, I expected the reaction from our company-users/company-committers. I disagree, too, but it looks like more people are against this and that won't change anymore. I agree with you: trunk is our development branch, I see no problem with making it Java 8 only. From the other issue, we have no important news to actually release this as 5.0 soon, so we can for sure play with it for a long time. To me it looks like some of our committers have forks off trunk they want to sell to their customers. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, March 10, 2014 1:34 PM To: dev@lucene.apache.org Subject: Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released) On Mon, Mar 10, 2014 at 5:46 AM, Uwe Schindler u...@thetaphi.de wrote: Hi, it looks like we all agree on the same: +1 for Lucene 4.x requirement on Java 7. -1 to not change trunk (keep it on Java 7, too). I will keep this vote open until this evening, but I don't expect any other change. Indeed, there are no real technical reasons to not move. The vote must be held open for 72 hours.
I haven't even had a chance to formulate my VOTE+reasoning yet, and I don't agree with this crap here. -- - Noble Paul
Re: [VOTE] Lucene / Solr 4.7.0 RC4
We should switch to not storing the scheme with base_url - or how about completely relying on node_name (and eliminating base_url eventually)? On Mon, Feb 24, 2014 at 2:25 AM, Mark Miller markrmil...@gmail.com wrote: No, it's a different issue than "urlScheme property should be whitelisted". The issue is that you can't set up a cluster with http and then later switch it to https without some manual workaround steps. You can create a fresh cluster with SSL with no manual workaround steps though. - Mark http://about.me/markrmiller On Feb 23, 2014, at 2:04 PM, Simon Willnauer simon.willna...@gmail.com wrote: mark, I am a bit confused - the issue you are mentioning here is not fixed yet, right? It's not part of r1570798 | noble | 2014-02-22 07:50:02 +0100 (Sat, 22 Feb 2014) | 1 line SOLR-3854 urlScheme property should be whitelisted ? On Sat, Feb 22, 2014 at 10:44 PM, Mark Miller markrmil...@gmail.com wrote: I have similar feelings to the whitelist issue (which looks like it did make it in). You can still use the feature - it's a new feature and so no regression - and so I'd vote to document the limitation around migrating from http to https (you have to start with https without manual work) and address this in a 4.7.1 or 4.8. I do think it's something we should address if a more serious issue causes a respin - it's a straightforward fix - we should always be using the coreNodeName to match state, never the url or address. - Mark http://about.me/markrmiller On Feb 22, 2014, at 8:19 AM, Steve Davids sdav...@gmail.com wrote: Hate to bring this up, though it must have gotten lost in the shuffle. When migrating from http to https in SOLR-3854, shards aren't obtaining their old core node name and resuming their prior assignments. This is because the base_url (which changed) is being compared instead of something more constant like the node_name.
A patch was attached yesterday: https://issues.apache.org/jira/browse/SOLR-3854?focusedCommentId=13908014&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13908014. It is a quick patch that hasn't really been tested yet; will do so later this evening. The current workaround would be that if clients want to migrate to https, they will need to shut down their servers, migrate the cluster state's base_url from http to https, and bring the servers back up. -Steve On Feb 22, 2014, at 5:46 AM, Simon Willnauer simon.willna...@gmail.com wrote: Please vote for the fourth Release Candidate for Lucene/Solr 4.7.0. You can download it here: http://people.apache.org/~simonw/staging_area/lucene-solr-4.7.0-RC4-rev1570806/ or run the smoke tester directly with this command line (don't forget to set JAVA6_HOME etc.): $ python3.2 -u dev-tools/scripts/smokeTestRelease.py http://people.apache.org/~simonw/staging_area/lucene-solr-4.7.0-RC4-rev1570806/ 1570806 4.7.0 /tmp/smoke_test_4_7 Smoketester said: SUCCESS! Here is my +1. This RC includes the following fixes compared to RC3: r1570798 | noble | 2014-02-22 07:50:02 +0100 (Sat, 22 Feb 2014) | 1 line SOLR-3854 urlScheme property should be whitelisted r1570795 | noble | 2014-02-22 07:42:22 +0100 (Sat, 22 Feb 2014) | 1 line SOLR-5762 broke backward compatibility of Javabin format r1570772 | sarowe | 2014-02-22 01:49:11 +0100 (Sat, 22 Feb 2014) | 1 line Fix CHANGES.txt to reflect the twisted evolution and current state of the Admin UI Files conf directory File Browser.
(merged branch_4x r1570771) r1570741 | sarowe | 2014-02-21 23:51:47 +0100 (Fri, 21 Feb 2014) | 1 line LUCENE-5465: Solr Contrib map-reduce breaks Manifest of all other JAR files by adding a broken Main-Class attribute (merged trunk r1570738) r1570628 | sarowe | 2014-02-21 17:38:59 +0100 (Fri, 21 Feb 2014) | 1 line SOLR-5729: intellij config (merge trunk r1570626) r1570576 | mvg | 2014-02-21 15:06:41 +0100 (Fri, 21 Feb 2014) | 2 lines Fixed typo in CHANGES.txt for issue LUCENE-5399 and moved that issue under optimizations. r1570562 | mikemccand | 2014-02-21 13:49:47 +0100 (Fri, 21 Feb 2014) | 1 line LUCENE-5461: fix thread hazard in ControlledRealTimeReopenThread causing a possible too-long wait time when a thread was waiting for a specific generation
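The fix discussed in the RC4 thread above - matching a returning core against cluster state by its stable coreNodeName rather than by base_url - can be sketched roughly as follows. This is an illustrative toy, not Solr's actual code (the class and method names are invented): the coreNodeName assigned at first registration survives an http-to-https switch, while a base_url lookup breaks because the URL embeds the scheme.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (names invented): why matching a restarted core
// against cluster state by coreNodeName survives an http -> https switch,
// while matching by base_url does not.
public class ReplicaMatch {

    // Replica records as registered in cluster state: coreNodeName -> base_url
    static final Map<String, String> replicas = new HashMap<>();
    static {
        replicas.put("core_node1", "http://host1:8983/solr");
        replicas.put("core_node2", "http://host2:8983/solr");
    }

    // Stable: the coreNodeName assigned at first registration never changes.
    static boolean knownByCoreNodeName(String coreNodeName) {
        return replicas.containsKey(coreNodeName);
    }

    // Fragile: after switching urlScheme to https the node reports a new
    // base_url, so a lookup by URL no longer finds the old assignment.
    static boolean knownByBaseUrl(String baseUrl) {
        return replicas.containsValue(baseUrl);
    }

    public static void main(String[] args) {
        // Node restarts with SSL enabled: same core, new scheme in its URL.
        String newBaseUrl = "https://host1:8983/solr";
        System.out.println(knownByCoreNodeName("core_node1")); // true
        System.out.println(knownByBaseUrl(newBaseUrl));        // false
    }
}
```

This is why the thread's workaround is to rewrite base_url in the stored cluster state by hand: the stored URLs still say http while the restarted nodes report https.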
RE: svn commit: r1570793 - in /lucene/dev/trunk/solr/solrj/src: java/org/apache/solr/client/solrj/request/JavaBinUpdateRequestCodec.java test-files/solrj/updateReq_4_5.bin test/org/apache/solr/client/
Sure, let's change the location once we identify the right place. I hope we don't need to block 4.7 for this. On 22 Feb 2014 14:32, Uwe Schindler u...@thetaphi.de wrote: Hi, your commit is likely to fail when we start to restructure directory layouts. Tests should (whenever possible) always use the classpath to find the files in test-files (which is part of the classpath). Attached is a patch with the recommended way. I suggest to fix this. ExternalPaths.SOURCE_HOME is unused in Solr tests; the constant is just a relic and only used internally on setup of the test infrastructure. Real tests should not use it. There are more usages of this constant in tests with absolute paths (including contrib/foobar/src/test-files in the contrib modules); I will open an issue to fix those. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: no...@apache.org [mailto:no...@apache.org] Sent: Saturday, February 22, 2014 7:19 AM To: comm...@lucene.apache.org Subject: svn commit: r1570793 - in /lucene/dev/trunk/solr/solrj/src: java/org/apache/solr/client/solrj/request/JavaBinUpdateRequestCodec.java test-files/solrj/updateReq_4_5.bin test/org/apache/solr/client/solrj/request/TestUpdateRequestCodec.java Author: noble Date: Sat Feb 22 06:19:16 2014 New Revision: 1570793 URL: http://svn.apache.org/r1570793 Log: SOLR-5762 broke backward compatibility of Javabin format Added: lucene/dev/trunk/solr/solrj/src/test-files/solrj/updateReq_4_5.bin (with props) Modified: lucene/dev/trunk/solr/solrj/src/java/org/apache/solr/client/solrj/request/JavaBinUpdateRequestCodec.java lucene/dev/trunk/solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequestCodec.java

Modified: lucene/dev/trunk/solr/solrj/src/java/org/apache/solr/client/solrj/request/JavaBinUpdateRequestCodec.java
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/solrj/src/java/org/apache/solr/client/solrj/request/JavaBinUpdateRequestCodec.java?rev=1570793&r1=1570792&r2=1570793&view=diff
==
--- lucene/dev/trunk/solr/solrj/src/java/org/apache/solr/client/solrj/request/JavaBinUpdateRequestCodec.java (original)
+++ lucene/dev/trunk/solr/solrj/src/java/org/apache/solr/client/solrj/request/JavaBinUpdateRequestCodec.java Sat Feb 22 06:19:16 2014
@@ -184,7 +184,13 @@ public class JavaBinUpdateRequestCodec {
     delByIdMap = (Map<String,Map<String,Object>>) namedList[0].get("delByIdMap");
     delByQ = (List<String>) namedList[0].get("delByQ");
     doclist = (List) namedList[0].get("docs");
-    docMap = (List<Entry<SolrInputDocument,Map<Object,Object>>>) namedList[0].get("docsMap");
+    Object docsMapObj = namedList[0].get("docsMap");
+
+    if (docsMapObj instanceof Map) { //SOLR-5762
+      docMap = new ArrayList(((Map)docsMapObj).entrySet());
+    } else {
+      docMap = (List<Entry<SolrInputDocument, Map<Object,Object>>>) docsMapObj;
+    }
     // we don't add any docs, because they were already processed

Added: lucene/dev/trunk/solr/solrj/src/test-files/solrj/updateReq_4_5.bin
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/solrj/src/test-files/solrj/updateReq_4_5.bin?rev=1570793&view=auto
==
Binary file - no diff available.
Modified: lucene/dev/trunk/solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequestCodec.java
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequestCodec.java?rev=1570793&r1=1570792&r2=1570793&view=diff
==
--- lucene/dev/trunk/solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequestCodec.java (original)
+++ lucene/dev/trunk/solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequestCodec.java Sat Feb 22 06:19:16 2014
@@ -18,6 +18,9 @@ package org.apache.solr.client.solrj.req
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
@@ -31,6 +34,7 @@ import junit.framework.Assert;
 import org.apache.lucene.util.LuceneTestCase;
 import org.apache.solr.common.SolrInputDocument;
 import org.apache.solr.common.SolrInputField;
+import org.apache.solr.util.ExternalPaths;
 import org.junit.Test;
 /**
@@ -160,6 +164,75 @@ public class TestUpdateRequestCodec exte
+ public void testBackCompat4_5() throws IOException {
+
+   UpdateRequest updateRequest = new UpdateRequest();
+
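Uwe's recommendation above - locating test files through the classpath instead of building absolute paths from ExternalPaths.SOURCE_HOME - looks roughly like this. A hedged sketch (the helper class is invented; the resource path matches the file added in r1570793, and whether it actually resolves depends on the test-files directory being on the classpath):

```java
import java.io.InputStream;

// Sketch (illustrative only): resolving a test file through the classpath
// instead of via an absolute ExternalPaths.SOURCE_HOME path, so the lookup
// keeps working even if the checkout's directory layout is restructured.
public class ResourceLookup {

    static InputStream openTestFile(String name) {
        // Returns null if the resource is not on the classpath -- no
        // dependency on the checkout's directory layout.
        return ResourceLookup.class.getResourceAsStream(name);
    }

    public static void main(String[] args) {
        InputStream in = openTestFile("/solrj/updateReq_4_5.bin");
        System.out.println(in == null ? "not on classpath" : "found");
    }
}
```

A test written this way fails loudly with a null stream instead of silently depending on where the source tree happens to live.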
Re: [VOTE] Lucene / Solr 4.7.0 RC3
I'm working on a test case for SOLR-5762; I'll commit it tomorrow IST. On 21 Feb 2014 20:05, Simon Willnauer simon.willna...@gmail.com wrote: So the problem here is where to draw the line. I think in a setup like we have, with Lucene and Solr in one codebase, the chance to hit a bug within these 72h is huge. This means the release process is a huge pain each time. Then we have bugs that justify a respin and some that don't. I looked at SOLR-5762 and it seems this one should cause a respin but LUCENE-5461 doesn't. It's hard to draw that line since it's pretty much up to the RM, and then you get heat if you draw that line. IMO we should improve our release process and release a point release every week, shortening the vote period for that to maybe 24h. That way we can get stuff out quickly and don't spend weeks on the release process. I will call this vote here as failed and build a new RC once SOLR-5762 is in. simon On Fri, Feb 21, 2014 at 3:23 PM, Steve Rowe sar...@gmail.com wrote: I volunteer to be 4.7.1 RM. I'd prefer to delay the 4.7.0 release to include all known bugfixes, though. Simon, if you're okay with it, I could take over as 4.7.0 RM and handle any respins. If not, it's your prerogative to continue with the current RC vote; others can express their opinions by voting. I'm sure it'll be fine either way. Steve On Feb 21, 2014, at 8:19 AM, Simon Willnauer simon.willna...@gmail.com wrote: Guys, I don't think we will ever get to the point where there isn't a bug. But we have to draw a line here. If we respin I have to step back as the RM since I just can't spend more than 7 days on this. I think there should be a 4.7.1 at some point where you can get your bugs fixed as everybody else, but we have to draw a line here. I think I am going to draw it here with the 3 +1s I am having. simon On Fri, Feb 21, 2014 at 2:12 PM, Tomás Fernández Löbbe tomasflo...@gmail.com wrote: Question here. Shouldn't SOLR-5762 be fixed before 4.7?
My understanding is that if not, Solr 4.7 won't be able to work with SolrJ from 4.6.1 or earlier? On Fri, Feb 21, 2014 at 5:01 AM, Robert Muir rcm...@gmail.com wrote: And I think it should be under optimizations, not changes in behavior. On Fri, Feb 21, 2014 at 6:31 AM, Martijn v Groningen martijn.v.gronin...@gmail.com wrote: Only spotted a small docs typo in the Lucene CHANGES.txt: the second issue under Changes in Runtime Behavior should be LUCENE-5399 instead of LUCENE-4399.
Re: [VOTE] Lucene / Solr 4.7.0 RC3
Backward incompatibility is something that should be considered a blocker. Anyway, I have fixed the issue in the 4.7 branch. I have also fixed the SOLR-3854 omission. You can respin a build. On Sat, Feb 22, 2014 at 1:10 AM, Simon Willnauer simon.willna...@gmail.com wrote: Thanks Noble! Can you please ping me here so I can kick off another RC. Regarding the bugs that should / should not block a release, I think it's hard to say which ones should, and that is my biggest problem here. I think with more frequent releases and more point releases we can make the intervals shorter and bugs get fixed quicker. I think it's also the responsibility of the other committers to maybe go back a step and ask themselves if a bug should block a release, and only if they are absolutely +1 on a respin mention it on the release thread. To me as the RM it's really hard to draw the line. I also think we should not push stuff into the release branch unless it's the cause of the respin; we should work towards a stable branch, and every change makes it less stable again IMO. just my $0.05 -- - Noble Paul
Re: [VOTE] Lucene / Solr 4.7.0 RC2
Yeah, -0 for a respin. On Thu, Feb 20, 2014 at 11:58 PM, Simon Willnauer simon.willna...@gmail.com wrote: this makes sense mark - I will keep on building the RC3 On Thu, Feb 20, 2014 at 7:27 PM, Mark Miller markrmil...@gmail.com wrote: Nice last minute catch! Since we have not doc'd SSL support yet, a user could work around via just setting the scheme manually in zk. As it's not a regression and there is a fairly simple workaround, I'd be -0 on a respin for it. SSL becomes possible for the few that need it and we can improve / fix it in a future release. - Mark http://about.me/markrmiller On Feb 20, 2014, at 1:13 PM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote: SOLR-3854 seems to be incomplete. It relies on a cluster-wide property called urlScheme. These properties need to be whitelisted in OverseerCollectionProcessor.KNOWN_CLUSTER_PROPS. The testcase directly writes the property to ZK; a normal user would only use the API. On Thu, Feb 20, 2014 at 10:55 PM, Simon Willnauer simon.willna...@gmail.com wrote: yeah we manually do svn merge per commit so at this stage only bugs should be ported. Feature wise we are set! thanks simon On Thu, Feb 20, 2014 at 6:23 PM, Benson Margulies bimargul...@gmail.com wrote: I get it. You're cherry-picking changes onto the rel branch. No, there's absolutely no reason to imagine grabbing 5449. -- - Noble Paul
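The whitelisting Noble describes above - accepting a cluster-wide property only if it appears in a known-props set such as OverseerCollectionProcessor.KNOWN_CLUSTER_PROPS - can be sketched like this. The class and validation code here are invented for illustration; only the property name urlScheme and the idea of a known-props set come from the thread:

```java
import java.util.Collections;
import java.util.Set;

// Illustrative sketch of a cluster-property whitelist (invented code;
// only "urlScheme" and the known-props idea come from the thread above).
public class ClusterProps {

    // SOLR-3854's urlScheme had to be added to a known-props set like this
    // (OverseerCollectionProcessor.KNOWN_CLUSTER_PROPS in the discussion).
    static final Set<String> KNOWN_CLUSTER_PROPS = Collections.singleton("urlScheme");

    static boolean isKnown(String name) {
        return KNOWN_CLUSTER_PROPS.contains(name);
    }

    static void setClusterProperty(String name, String value) {
        if (!isKnown(name)) {
            // Reject unknown properties instead of silently writing them to ZK.
            throw new IllegalArgumentException("Not a known cluster property: " + name);
        }
        // ... a real implementation would persist the property to ZooKeeper here ...
    }

    public static void main(String[] args) {
        setClusterProperty("urlScheme", "https"); // accepted
        try {
            setClusterProperty("bogusProp", "x"); // rejected
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

This is also why the test in question could bypass the check by writing straight to ZK: the whitelist only guards the API path.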
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome on board Anshum, looking forward to more exciting days. --Noble On Mon, Feb 17, 2014 at 8:44 AM, Anshum Gupta ans...@anshumgupta.net wrote: Thanks Mark. I spent most of my life in New Delhi, India, other than short stints in different parts of the country (including living in a beach house on a tropical island for 3 years when I was young). After spending the last 3 years in Bangalore, I just relocated to San Francisco to be at the LucidWorks office in the Bay Area. Prior to this I've been a part of the search teams at A9 (CloudSearch), Cleartrip.com and Naukri.com, where I was involved in designing and developing search and recommendation engines. These days, I love contributing stuff to Solr, primarily around SolrCloud, and hope to continue to be at least as active towards it. In my free time I love photography, traveling, eating out and drinking my beer. On Sun, Feb 16, 2014 at 2:33 PM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller -- Anshum Gupta http://www.anshumgupta.net -- - Noble Paul
Re: The Old Git Discussion
I personally have -0. I don't have any strong preference; the minus is because it would change my comfortable workflow. Is there any pressing need to switch? Or, if someone has to give one real reason for the switch, what would it be? On 3 Jan 2014 20:49, Mark Miller markrmil...@gmail.com wrote: Just to answer some of your questions: On Jan 3, 2014, at 8:18 AM, Uwe Schindler u...@thetaphi.de wrote: Hi, I fully agree with Robert: I don't want to move to GIT. In addition, unless there is some tool that works as well as Windows' TortoiseSVN also for GIT, so I can merge in milliseconds(TM), I won't commit anything. SmartGit is way better than TortoiseSVN IMO. Your favorite tool is a silly way to decide something like this IMO as well though. I just note: I was working as a committer for the PHP project (maintaining the SAPI module for Oracle iPlanet Webserver), but since they moved to GIT 2 years ago, I never contributed anything anymore. I just don't understand how it works and it's completely unusable to me. E.g. look at this bullshit: https://wiki.php.net/vcs/gitworkflow - Sorry, this is a no-go. And I have no idea what all these cryptic commands mean and I don't want to learn that. If we move to GIT, somebody else will have to commit my patches. Others committed your patches in the past and I'm sure they will continue to do so in the future if you desire. And the other comment that was given here is not true: Merging with SVN works perfectly fine and is easy to do, unless you use the command line or Eclipse's bullshit SVN client (that never works correctly). With a good GUI (like the fantastic TortoiseSVN), merging is so simple and conflicts can be processed in milliseconds(TM). And it is much easier to understand. An opinion not commonly shared, by my reading. At a minimum, simply opinion though. Also Subversion is an Apache Project and I want to add: We should eat our own dog food.
Just to move to something with a crazy license and a broken user interface, just because it's cool, is a no-go to me. Certainly not because it's cool! Who argued that? We would also need to rewrite all our checking tasks (like the check-svn-working-copy ANT task) to work with GIT. Is there a pure Java library that works for GIT? I assume: No. You assume wrong. JGit is used by many projects; I've used it myself. So this is another no-go for me. The checks we do cannot be done by command line. I guess it's not a no-go then, because your assumption was wrong… - Mark
RE: Collections API
If the patch is applied, the workaround should not be required. On 29 Nov 2013 08:17, Steve Molloy smol...@opentext.com wrote: Thanks, I already had genericCoreNodeNames=true in the solrcloud section of solr.xml, new format. But I had a str entry instead of a bool, which apparently is simply treated as false. Anyhow, in my case the fix works if I move the bit setting the coreNodeName after the publish, not before. If it's before, I get a timeout error while it waits for a replica that is never set in waitForShardId. I'll both apply the modified patch and switch from str to bool. :) Thanks for the help, Steve From: Alexey Serba [ase...@gmail.com] Sent: November 28, 2013 2:10 AM To: dev@lucene.apache.org Subject: Re: Collections API https://issues.apache.org/jira/browse/SOLR-5510 I don't really understand all the details of why that is happening, but the workaround is to add the genericCoreNodeNames=${genericCoreNodeNames:true} attribute to the cores element in your solr.xml file. On Tue, Nov 26, 2013 at 10:10 PM, Steve Molloy smol...@opentext.com wrote: I'm trying to reconcile our fork with the 4.6 tag and I'm getting weird behaviour in the Collections API, more specifically in ZkController's preRegister method after calling the create method of the Collections API. When it checks if a slice has a replica for the current node name, there never is any, because at this stage the slice has no replicas. This is the new code that seems to be causing my issue; I can force autoCreated to always be true to avoid the issue, but would like a cleaner way if there is one.
if (cd.getCloudDescriptor().getCollectionName() != null
    && cd.getCloudDescriptor().getCoreNodeName() != null) {
  // we were already registered
  if (zkStateReader.getClusterState().hasCollection(cd.getCloudDescriptor().getCollectionName())) {
    DocCollection coll = zkStateReader.getClusterState().getCollection(cd.getCloudDescriptor().getCollectionName());
    if (!"true".equals(coll.getStr("autoCreated"))) {
      Slice slice = coll.getSlice(cd.getCloudDescriptor().getShardId());
      if (slice != null) {
==>     if (slice.getReplica(cd.getCloudDescriptor().getCoreNodeName()) == null) {
          log.info("core_removed This core is removed from ZK");
          throw new SolrException(ErrorCode.NOT_FOUND, coreNodeName + " is removed");
        }
      }
    }
  }
}
Thanks. Steve
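For reference, the str-vs-bool pitfall Steve hit comes down to how the flag is declared in the new-style solr.xml. A minimal fragment (assuming the new-style solr.xml layout; only the genericCoreNodeNames flag comes from the thread) - note it must be a bool element, since per the thread a str entry is apparently treated as false:

```xml
<solr>
  <solrcloud>
    <!-- Must be a <bool>: per the thread above, a
         <str name="genericCoreNodeNames"> entry is treated as false. -->
    <bool name="genericCoreNodeNames">true</bool>
  </solrcloud>
</solr>
```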
Re: [VOTE] Release Lucene/Solr 4.5.0
+1 On 19 Sep 2013 16:38, Marcelo Costa marcelojsco...@gmail.com wrote: +1 Marcelo Costa - http://www.infoq.com/br/author/Marcelo-Costa On Wed, Sep 18, 2013 at 7:41 PM, Jack Krupansky j...@basetechnology.com wrote: +1 for my quick test. Even tried the UI on IE10. -- Jack Krupansky -Original Message- From: Adrien Grand Sent: Wednesday, September 18, 2013 5:46 PM To: dev@lucene.apache.org Subject: [VOTE] Release Lucene/Solr 4.5.0 Hi all, Please test and vote to release the following Lucene and Solr 4.5.0 artifacts: http://people.apache.org/~jpountz/staging_area/lucene-solr-4.5.0-RC0-rev1524484/ This vote is open until Monday. Smoke tester passed and Elasticsearch tests ran successfully with these artifacts, so here is my +1. -- Adrien
Re: [VOTE] Release Lucene/Solr 4.5.0
I see a commit message on branch_4x for the changes I made. Isn't it the same branch? On 19 Sep 2013 19:00, Adrien Grand jpou...@gmail.com wrote: Hi, On Thu, Sep 19, 2013 at 3:10 PM, Yonik Seeley yo...@lucidworks.com wrote: It looks like the last commit on SOLR-4221 didn't make it onto the 4.5 branch. This is a change to a new API and hence needs to make it into 4.5. Thanks for noticing this issue, I will backport the commit. Hoss, do you want to take advantage of the fact that we need to respin to backport LUCENE-5223? Thanks. -- Adrien
Re: [jira] [Commented] (SOLR-4221) Custom sharding
Yes, it is resolved. On 12 Sep 2013 21:22, Chris Hostetter hossman_luc...@fucit.org wrote: : just grant me edit access Are you talking about edit access for the ref guide? There's a well documented process for that... https://cwiki.apache.org/confluence/display/solr/Internal+-+Maintaining+Documentation#Internal-MaintainingDocumentation-WhoCanEditThisDocumentation https://cwiki.apache.org/confluence/display/solr/Internal+-+CWIKI+ACLs That doesn't answer my first question however: is SOLR-4221 complete? Should the issue be resolved as fixed in 4.5? : On Thu, Sep 12, 2013 at 5:11 AM, Hoss Man (JIRA) j...@apache.org wrote: : : : [ : https://issues.apache.org/jira/browse/SOLR-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764968#comment-13764968 ] : : Hoss Man commented on SOLR-4221: : : : Is this feature complete? (all of the subtasks are marked resolved and : several commits associated with this issue are in branch 4x) : : : Can someone who understands all of the various changes made in these : issues please update the ref guide (or post a comment suggesting what : additions should be made)... : : https://cwiki.apache.org/confluence/display/solr/Collections+API : : : : : Custom sharding : --- : : Key: SOLR-4221 : URL: https://issues.apache.org/jira/browse/SOLR-4221 : Project: Solr : Issue Type: New Feature : Reporter: Yonik Seeley : Assignee: Noble Paul : Attachments: SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch, : SOLR-4221.patch, SOLR-4221.patch : : : Features to let users control everything about sharding/routing. : : -- : This message is automatically generated by JIRA.
: If you think it was sent incorrectly, please contact your JIRA : administrators : For more information on JIRA, see: http://www.atlassian.com/software/jira : : - : To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org : For additional commands, e-mail: dev-h...@lucene.apache.org : : : : : -- : - : Noble Paul : -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [jira] [Commented] (SOLR-4221) Custom sharding
Just grant me edit access. On Thu, Sep 12, 2013 at 5:11 AM, Hoss Man (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764968#comment-13764968] Hoss Man commented on SOLR-4221: Is this feature complete? (all of the subtasks are marked resolved and several commits associated with this issue are in branch 4x) Can someone who understands all of the various changes made in these issues please update the ref guide (or post a comment suggesting what additions should be made)... https://cwiki.apache.org/confluence/display/solr/Collections+API Custom sharding --- Key: SOLR-4221 URL: https://issues.apache.org/jira/browse/SOLR-4221 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Assignee: Noble Paul Attachments: SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch Features to let users control everything about sharding/routing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira -- - Noble Paul
Re: lazily-loaded cores and SolrCloud
It is worth doing. A lot of people will need it. Let's open an issue and come up with a design for it. On 14 Aug 2013 02:16, Yonik Seeley yo...@lucidworks.com wrote: At a high level, I think the idea is fine (and I've seen a number of people that wanted it). The question is more around one of implementation... would it make a mess of things or not. The answer to that I think is probably mostly related to issues around how ZooKeeper is currently handled. I don't see any issues with other things like spinning up a core when a request comes in for it. -Yonik http://lucidworks.com On Tue, Aug 13, 2013 at 4:26 PM, Erick Erickson erickerick...@gmail.com wrote: There was a question on the user's list today about making lazily-loaded (aka transient) cores work with SolrCloud where I basically punted and said not designed with that in mind. I've kind of avoided thinking about this as the use-case; the transient code wasn't written with SolrCloud in mind. But what is the general reaction to that pairing? Mostly I'm looking for feedback at the level of no way that could work without invasive changes to SolrCloud, don't even go there or sure, just allow ZK to get a list of all cores and it'll be fine, the user is responsible for the quirks though. Some questions that come to my mind: Should a core that's not loaded be considered live by ZK? Would simply returning a list of all cores (both loaded and not loaded) be sufficient for ZK? (this list is already available so the admin UI can list all cores). Does SolrCloud distributed update processing go through (or could it be made to go through) the path that autoloads a core? Ditto for querying. I suspect the answer to both is that it'll just happen. Would the idea of waiting for all the cores to load on all the nodes for an update be totally unacceptable? We already have the distributed deadlock potential, and this seems to make that more likely by lengthening out the time the semaphore in question is held.
Would re-synching/leader election be an absolute nightmare? I can imagine that if all the cores for a particular shard weren't loaded at startup, there'd be a terrible time waiting for leader election, for instance. Stuff I haven't thought of? Mostly I'm trying to get a sense of the community here about whether supporting transient cores in SolrCloud mode would be easy/do-able/really_hard/totally_unacceptable. Thanks, Erick
Re: Threshold Checks for Replication in solrconfig.xml
What is the objective here? You can already disable replication with a command on the master and re-enable it later. Do you wish to make that a bit easier with this? The threshold check, according to your example, will delay replication forever if the threshold is never reached, so this is only useful if you are doing a fresh reindex. On Mon, Aug 5, 2013 at 9:44 AM, Kranti Parisa kranti.par...@gmail.com wrote: Hi, I think it would be nice to configure Solr with threshold checks before doing index replication. This would stop a bad index from being copied over to the slaves, which are ideally the ones serving the user requests. In our case, we have a Solr Indexer which indexes the documents. Before starting the indexing process we disable replication and then index the documents. Then we perform the threshold checks, and if we have a reasonable index we enable replication, so that the Solr Query Engines will have a good index to serve the user queries. I have been thinking about how it would be if we had this facility in Solr (solrconfig.xml) by default for everyone. We may have something like this inside the Replication Request Handler section (either the master can check before enabling replication, or the slave can check against the master before downloading the index, whichever is best; I think it is better that the master does this check so that all the slaves need not check the same thing against the master):

<lst name="thresholdchecks">
  <str query="id:[* TO *]">10</str>
  <str query="id:[* TO *] AND type:movie">4</str>
  <str query="id:[* TO *] AND type:music">1</str>
</lst>

I think this is a very common task for people using Solr replication. I am interested in working on this feature and committing it. Before that I would like to know your views on it. If this already exists or is coming up, please let me know! Thanks Regards, Kranti K Parisa http://www.linkedin.com/in/krantiparisa -- - Noble Paul
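A minimal sketch of the decision logic the thread is discussing, in plain Java. The class and method names below are made up, and the query execution is stubbed out; on a real master the per-query counts would come from searching the freshly built index before replication is enabled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a threshold check before enabling replication.
// Counts are passed in so the decision logic stands alone.
public class ThresholdCheck {

    // Returns true only if every configured query meets its minimum count,
    // i.e. only then would replication be enabled.
    public static boolean passes(Map<String, Integer> thresholds,
                                 Map<String, Integer> actualCounts) {
        for (Map.Entry<String, Integer> e : thresholds.entrySet()) {
            Integer actual = actualCounts.get(e.getKey());
            if (actual == null || actual < e.getValue()) {
                return false;   // below threshold: keep replication disabled
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> thresholds = new LinkedHashMap<>();
        thresholds.put("id:[* TO *]", 10);
        thresholds.put("id:[* TO *] AND type:movie", 4);

        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("id:[* TO *]", 12);
        counts.put("id:[* TO *] AND type:movie", 3);   // too few movies

        System.out.println(passes(thresholds, counts)); // prints "false"
    }
}
```

Note this also illustrates Noble's objection: if a query never reaches its threshold, `passes` stays false and replication would stay disabled indefinitely.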
Re: Threshold Checks for Replication in solrconfig.xml
It sounds like you are concerned about a bad index. Don't you think we should be trying to avoid replicating a bad index in the first place, rather than relying on thresholds? On Mon, Aug 5, 2013 at 11:55 AM, Kranti Parisa kranti.par...@gmail.com wrote: Yes, we can disable replication and perform the checks manually; that is what we are doing currently. And yes, the idea of configuring threshold checks is to delay replication in case of a bad index (if threshold checks are not needed, we can avoid configuring them). It would give us control over a bad index, especially in the case of frequent deletes/updates for expired assets. Thanks Regards, Kranti K Parisa http://www.linkedin.com/in/krantiparisa On Mon, Aug 5, 2013 at 2:12 AM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote: What is the objective here? You can already disable replication with a command on the master and re-enable it later. Do you wish to make that a bit easier with this? The threshold check, according to your example, will delay replication forever if the threshold is never reached, so this is only useful if you are doing a fresh reindex. On Mon, Aug 5, 2013 at 9:44 AM, Kranti Parisa kranti.par...@gmail.com wrote: Hi, I think it would be nice to configure Solr with threshold checks before doing index replication. This would stop a bad index from being copied over to the slaves, which are ideally the ones serving the user requests. In our case, we have a Solr Indexer which indexes the documents. Before starting the indexing process we disable replication and then index the documents. Then we perform the threshold checks, and if we have a reasonable index we enable replication, so that the Solr Query Engines will have a good index to serve the user queries. I have been thinking about how it would be if we had this facility in Solr (solrconfig.xml) by default for everyone. We may have something like this inside the Replication Request Handler section (either the master can check before enabling replication, or the slave can check against the master before downloading the index, whichever is best; I think it is better that the master does this check so that all the slaves need not check the same thing against the master):

<lst name="thresholdchecks">
  <str query="id:[* TO *]">10</str>
  <str query="id:[* TO *] AND type:movie">4</str>
  <str query="id:[* TO *] AND type:music">1</str>
</lst>

I think this is a very common task for people using Solr replication. I am interested in working on this feature and committing it. Before that I would like to know your views on it. If this already exists or is coming up, please let me know! Thanks Regards, Kranti K Parisa http://www.linkedin.com/in/krantiparisa -- - Noble Paul -- - Noble Paul
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome, Cassandra. On 1 Aug 2013 04:18, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is set up, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
branch 4_x vs trunk
Are all the check-ins made to the 4_x branch available in trunk too? I see some tests failing in trunk, but 4_x passes all tests. -- - Noble Paul
Re: Anyone interested about using GPU to improve the performance of Lucene?
It does not really have to be a platform-independent thing. It can be a configurable switch, where a user who has the particular hardware can enable it and take advantage of the perf boost. But we should be able to demonstrate some significant improvement using NVIDIA GPUs. On Wed, Jul 10, 2013 at 2:52 AM, Uwe Schindler u...@thetaphi.de wrote: Thanks for the information about the CUDA project! I think the main reason why you have not heard anything about Lucene/Solr/ElasticSearch working together with GPUs is mainly the fact that Apache Lucene and all search servers on top of Lucene (Apache Solr, ElasticSearch) are pure Java applications, highly optimized to run in the Oracle virtual machine. Currently there is no official support for GPUs in the Java APIs; you can only use proprietary wrapper libraries to make use of CUDA (e.g. http://www.jcuda.org/). It would be great if there were a platform-independent way (directly in the official Java API) to execute jobs on GPUs. It might be worth a try to implement the Lucene block codecs (the abstraction of the underlying posting list formats) using a GPU. Because this is encapsulated in a public API, it could be a separate project, using the JNI-based CUDA wrappers to encode/decode PFOR postings lists. The query execution logic is harder to port, because there is a lot of abstraction involved (posting lists are doc-id iterators), which would need to be short-circuited. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: Yanning Li [mailto:yanni...@nvidia.com] Sent: Tuesday, July 09, 2013 11:02 PM To: dev@lucene.apache.org Subject: Anyone interested about using GPU to improve the performance of Lucene? Hi all, I work for the NVIDIA Tesla Accelerating Computing Group. Recently we have noticed that GPUs can really accelerate the performance of search engines.
There are proof points not only from Google but also from others, such as Yandex, Baidu, Bing, etc., but not much around Solr/Lucene. So we are trying to engage with Lucene developers more actively. 1) If possible we would like to hear your perspective: are there opportunities for GPUs in Lucene/Solr? 2) Is anyone interested in using GPUs to accelerate the performance of Lucene/Solr? If so, please feel free to let me know; we can send out free GPUs to get a project started. Attached is a paper about using GPUs to accelerate index compression, in case you are interested. Looking forward to hearing from some of you, Best Yanning -- - Noble Paul
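For context on what a GPU block-codec experiment would actually accelerate: Lucene posting lists are sorted doc ids, typically stored as deltas and bulk-compressed (Uwe mentions PFOR). A toy, pure-Java sketch of the simpler variable-byte flavor, not Lucene's actual codec API, shows the kind of encode/decode kernel such a project would offload:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: delta + variable-byte compression of a sorted doc-id
// list. Real Lucene codecs use block-oriented schemes (e.g. PFOR), which
// are more amenable to the bulk parallelism a GPU offers.
public class VarBytePostings {

    // Delta-encode the sorted doc ids, then variable-byte compress each delta.
    public static byte[] encode(int[] docIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int id : docIds) {
            int delta = id - prev;
            prev = id;
            while ((delta & ~0x7F) != 0) {      // more than 7 bits remain
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);                    // final byte, high bit clear
        }
        return out.toByteArray();
    }

    public static int[] decode(byte[] bytes) {
        List<Integer> ids = new ArrayList<>();
        int value = 0, shift = 0, prev = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) != 0) {
                shift += 7;                      // continuation byte
            } else {
                prev += value;                   // undo the delta encoding
                ids.add(prev);
                value = 0;
                shift = 0;
            }
        }
        int[] result = new int[ids.size()];
        for (int i = 0; i < result.length; i++) result[i] = ids.get(i);
        return result;
    }

    public static void main(String[] args) {
        int[] ids = {3, 7, 200, 100000};
        System.out.println(Arrays.equals(ids, decode(encode(ids)))); // prints "true"
    }
}
```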
Writing serialized java Object to ZK
OverseerCollectionProcessor.run() writes a serialized Java object to ZK. Is that by design? It makes it really hard to debug some ZK messages. I feel we should be consistent in what we write to ZK, and it should always be JSON. -- - Noble Paul
Re: Writing serialized java Object to ZK
Java serialization is not any more compact than JSON. On Mon, Jun 24, 2013 at 10:02 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: My guess is that it is done to avoid running into the max payload limit of a ZK node? On Mon, Jun 24, 2013 at 9:59 PM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote: OverseerCollectionProcessor.run() writes a serialized Java object to ZK. Is that by design? It makes it really hard to debug some ZK messages. I feel we should be consistent in what we write to ZK, and it should always be JSON. -- - Noble Paul -- Regards, Shalin Shekhar Mangar. -- - Noble Paul
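The size claim is easy to check with nothing but the JDK: serialize a small map with ObjectOutputStream and compare it against a hand-built JSON rendering of the same message (the message content below is made up for illustration):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// For a small message like the ones the Overseer writes to ZK, Java
// serialization carries stream headers and class metadata that a JSON
// rendering does not.
public class SerializationSize {

    public static int javaSerializedSize(Map<String, String> msg) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(msg);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Hand-rolled JSON, purely for the size comparison (no escaping).
    public static int jsonSize(Map<String, String> msg) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> e : msg.entrySet()) {
            if (sb.length() > 1) sb.append(',');
            sb.append('"').append(e.getKey()).append("\":\"")
              .append(e.getValue()).append('"');
        }
        return sb.append('}').length();
    }

    public static void main(String[] args) {
        Map<String, String> msg = new LinkedHashMap<>();
        msg.put("operation", "createcollection");  // illustrative payload
        msg.put("name", "collection1");
        System.out.println(javaSerializedSize(msg) > jsonSize(msg)); // prints "true"
    }
}
```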
How to control the dataDir in SolrCloud
When I create a collection in SolrCloud, the dataDir is created implicitly under the instanceDir. I guess there is no way to configure this. It is essential for large real-world deployments to control where the data lives. -- - Noble Paul
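For what it's worth, the legacy solr.xml format does let you pin a per-core dataDir when cores are declared by hand; whether that helps for collections created on the fly through the Collections API is exactly the gap this mail points at. A sketch of that old-style configuration, with paths that are illustrative only:

```xml
<!-- legacy solr.xml; the dataDir path below is made up for illustration -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="collection1" instanceDir="collection1"
          dataDir="/mnt/bigdisk/solr/collection1/data"/>
  </cores>
</solr>
```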
Is solr.xml persistence completely broken in trunk?
I created a couple of cores from coreadmin, and after I restarted the server the cores are gone. I see that the persist attribute is deprecated. Is there a bug open for this already? -- - Noble Paul
Re: Solr - ORM like layer
SolrJ has very limited ORM-like functionality. It is built keeping in mind the limitations of the Solr schema. Please take a look and let us know what more we can add: http://wiki.apache.org/solr/Solrj#Directly_adding_POJOs_to_Solr On Tue, Jun 4, 2013 at 6:22 PM, Tuğcem Oral tugcem.o...@gmail.com wrote: Hi folks, I wonder whether there exists an ORM-like layer for Solr, such that it generates the Solr schema from a given complex object type and indexes a given list of corresponding objects. I wrote a simple module for that need in one of my projects and would happily generalize it and contribute it to Solr, if no such module exists or is in progress. Thanks all. -- TO -- - Noble Paul
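Roughly what an annotation-driven binder does under the hood can be sketched in pure JDK reflection. The `@SolrField` annotation and class names below are made up for the sketch; SolrJ's real annotation is `org.apache.solr.client.solrj.beans.Field`, and the map stands in for a SolrInputDocument:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Stripped-down binder sketch: walk the bean's fields, read the annotated
// ones, and build a field-name -> value map.
public class MiniBinder {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface SolrField {
        String value() default "";  // index field name; defaults to the Java field name
    }

    public static Map<String, Object> toDocument(Object bean) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Field f : bean.getClass().getDeclaredFields()) {
            SolrField ann = f.getAnnotation(SolrField.class);
            if (ann == null) continue;          // unannotated fields are skipped
            f.setAccessible(true);
            String name = ann.value().isEmpty() ? f.getName() : ann.value();
            try {
                doc.put(name, f.get(bean));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return doc;
    }

    // Example bean: only annotated fields reach the document.
    public static class Item {
        @SolrField public String id;
        @SolrField("name_s") public String name;
        public String internalNote;             // not indexed
    }
}
```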
build fails
is it just me?

BUILD FAILED
C:\work\lucene_dev_fresh\build.xml:23: The following error occurred while executing this line:
C:\work\lucene_dev_fresh\solr\build.xml:135: The following error occurred while executing this line:
C:\work\lucene_dev_fresh\solr\core\build.xml:21: The following error occurred while executing this line:
C:\work\lucene_dev_fresh\solr\common-build.xml:67: C:\work\lucene_dev_fresh\solr\core\lib not found.

-- - Noble Paul
Re: Push for a Solr 1.4.1 Bug Fix Release?
This has to be fixed as well: https://issues.apache.org/jira/browse/SOLR-1769 On Mon, May 31, 2010 at 7:50 AM, Bill Au bill.w...@gmail.com wrote: +1 I can help test any RC too. Bill On Sun, May 30, 2010 at 7:03 PM, Koji Sekiguchi k...@r.email.ne.jp wrote: (10/05/30 14:08), Chris Hostetter wrote: FYI... : ## 9 Bugs w/fixes on the 1.5 branch that seem serious enough : ## that they warrant a 1.4.1 bug-fix release... ...those 9 bugs have been merged to branch-1.4. I'll work on the remainders listed below (which includes upgrading the lucene jars) tomorrow or Monday : https://issues.apache.org/jira/browse/SOLR-1522 : https://issues.apache.org/jira/browse/SOLR-1538 : https://issues.apache.org/jira/browse/SOLR-1558 : https://issues.apache.org/jira/browse/SOLR-1563 : https://issues.apache.org/jira/browse/SOLR-1579 : https://issues.apache.org/jira/browse/SOLR-1580 : https://issues.apache.org/jira/browse/SOLR-1582 : https://issues.apache.org/jira/browse/SOLR-1596 : https://issues.apache.org/jira/browse/SOLR-1651 https://issues.apache.org/jira/browse/SOLR-1934 -Hoss I'll backport the following soon if there are no objections: * SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers retrieved from ContentStreams are not closed in various places, resulting in file descriptor leaks. (Christoff Brill, Mark Miller) Koji -- http://www.rondhuit.com/en/ -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: Vote on merging dev of Lucene and Solr
+1 On Thu, Mar 4, 2010 at 6:32 PM, Mark Miller markrmil...@gmail.com wrote: For those committers that don't follow the general mailing list, or follow it that closely, we are currently having a vote for committers: http://search.lucidimagination.com/search/document/4722d3144c2e3a8b/vote_merge_lucene_solr_development -- - Mark http://www.lucidimagination.com -- - Noble Paul | Systems Architect| AOL | http://aol.com - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Commented: (SOLR-1775) Replication of 300MB stops indexing for 5 seconds when syncing
This should be because of GC. Do you have autowarming enabled? On Sun, Feb 28, 2010 at 3:45 AM, Bill Bell (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839318#action_12839318 ] Bill Bell commented on SOLR-1775: - I agree it is i/o bound. But when we sync using Java replication, the slave STOPS taking requests for about 5 seconds. I.E. 1. The sync begins - initiated by the slave (the files are almost 1GB) 2. The slave is still taking requests 3. The slave completes the Sync 4. The requests to the slave STOPS for 5 seconds. 5. The slave continues taking requests I think the copy from one dir to another of a 1GB file is slowing down the machine - the i/o waits are like 50%. Is there a way to reduce the impact of the copy and switchover? Replication of 300MB stops indexing for 5 seconds when syncing -- Key: SOLR-1775 URL: https://issues.apache.org/jira/browse/SOLR-1775 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 1.4 Environment: Centos 5.3 Reporter: Bill Bell When using Java replication in v1.4 and doing a sync from master to slave, the slave delays for about 5-10 seconds. When using rsync this does not occur. Is there a way to thread better or lower the priority to not impact queries when it is bringing over the index files from the master? Maybe a separate process? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: Solrj @Field getter annotation
On Mon, Feb 22, 2010 at 11:50 PM, Norman Wiechmann n.wiechm...@gmx.net wrote: Noble Paul നോബിള് नोब्ळ् wrote: if it is not provided with setters , what do you suggest? provide it with getters? yes, i would like to have the @Field annotation at getters. and if there is no reason against this i can provide a patch for DocumentObjectBinder that adds getter annotation and keeps the setter annotation for backward compatibility. I personally don't think it makes a lot of difference just because other JPA tools don't follow that pattern. So removing the annotation from the setter is not required best, Norman -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: Solrj @Field getter annotation
if it is not provided with setters , what do you suggest? provide it with getters? On Mon, Feb 22, 2010 at 2:09 PM, Norman Wiechmann n.wiechm...@gmx.net wrote: Noble Paul നോബിള് नोब्ळ् schrieb: On Sun, Feb 21, 2010 at 11:04 PM, Norman Wiechmann n.wiechm...@gmx.net wrote: Hi, I'm wondering if there is a reason why the @Field annotation is restricted to setters? The preferred way is to apply the annotation to fields. it is provided on setters also if you prefer the set/get xxx() route. Ok, but fields are not my prefered way. I don't want to store all values required for the index into extra fields. As described in my first mail, I intend to use the annotation to generate index fields only, not to read them. So there is no need to have always a field. Sometimes I would like to create an solr annotated wrapper to the business object and sometimes I would like to annotate my business object directly. In my case I would like to index beans from java using the solrj client implementation. Transforming documents to beans is not required because I use queries to Solr from JavaScript only. To avoid the creation of setter methods just to use the @Field annotation I extended SolrServer to overwrite getBinder() and added an DocumentObjectBinder implementation that supports @Field annotations at bean property getter methods. For me it feels very unusual to add annotations to setters. It does not match with the experience I have from other libraries like JPA or JAXB. Best, Norman -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: Solrj @Field getter annotation
On Sun, Feb 21, 2010 at 11:04 PM, Norman Wiechmann n.wiechm...@gmx.net wrote: Hi, I'm wondering if there is a reason why the @Field annotation is restricted to setters? The preferred way is to apply the annotation to fields. it is provided on setters also if you prefer the set/get xxx() route. In my case I would like to index beans from java using the solrj client implementation. Transforming documents to beans is not required because I use queries to Solr from JavaScript only. To avoid the creation of setter methods just to use the @Field annotation I extended SolrServer to overwrite getBinder() and added an DocumentObjectBinder implementation that supports @Field annotations at bean property getter methods. For me it feels very unusual to add annotations to setters. It does not match with the experience I have from other libraries like JPA or JAXB. Best, Norman -- - Noble Paul | Systems Architect| AOL | http://aol.com
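Norman's idea, reading index values from annotated getters instead of fields or setters, can be sketched with plain reflection. The `@IndexField` annotation and class names here are hypothetical, not the real SolrJ DocumentObjectBinder API; the point is that a derived value can be indexed without any backing field:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Getter-based binder sketch: scan the bean's public methods and call the
// annotated zero-argument getters to populate the document map.
public class GetterBinder {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface IndexField {
        String value();  // index field name
    }

    public static Map<String, Object> toDocument(Object bean) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Method m : bean.getClass().getMethods()) {
            IndexField ann = m.getAnnotation(IndexField.class);
            if (ann == null || m.getParameterCount() != 0) continue;
            try {
                doc.put(ann.value(), m.invoke(bean));  // call the getter
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return doc;
    }

    // Business object: the indexed value is derived, no extra field or setter.
    public static class Product {
        private final String first, last;
        public Product(String first, String last) { this.first = first; this.last = last; }

        @IndexField("full_name_s")
        public String getFullName() { return first + " " + last; }
    }
}
```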
Re: Can we move FileFetcher out of SnapPuller?
On Sat, Feb 20, 2010 at 3:16 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Can we move FileFetcher out of SnapPuller? This will assist with reusing the replication handler for moving/copying cores. No problem. Raise an issue. -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: DIH $skipDoc issue
Did the patch solve your problem? On Fri, Feb 5, 2010 at 12:17 AM, Gian Marco Tagliani gmtagli...@grupointercom.com wrote: Hi all, I'm using version 1.4.0 of Solr and I'm having some trouble with the DIH when I use the special command $skipDoc. After skipping a document, the next one is not inserted properly. My DIH configuration is quite complex, so I'll try to explain myself with a simpler example:

item table:
id  name
1   aaa
2   bbb

feature table:
item_id  hidden
1        true
2        false

DIH conf:

<document name="products">
  <entity name="item" query="select * from item">
    <field column="ID" name="id"/>
    <field column="NAME" name="name"/>
    <entity name="feature" query="select hidden from feature where item_id='${item.ID}'">
      <field name="$skipDoc" column="hidden"/>
    </entity>
  </entity>
</document>

The result I expected is that the record named bbb would be imported, but what actually happened is that the other record (the aaa one) was inserted. I took a look at the DIH code and found a possible cause. In the DocBuilder class, when a $skipDoc is detected, an exception is raised. After handling the exception the next loop iteration starts without cleaning up the doc variable. When the next record is read, the addFieldToDoc method can't fill the doc fields because they are already filled. To solve this I just clean up the doc variable when handling the exception. The patch with this tiny change is attached to this mail. Did anybody else encounter this problem? Is the change correct? Thanks Gian Marco Tagliani -- - Noble Paul | Systems Architect| AOL | http://aol.com
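The bug and its fix can be shown in a simplified loop, not DocBuilder's real code: a document map is built per record, a skip raises an exception, and the handler must discard the partially built document so the next record starts clean. Class and method names are made up:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the DocBuilder skip-handling bug described above.
public class SkipDocLoop {

    static class SkipDocException extends RuntimeException {}

    // Each record maps field name -> value; a record with skip="true" is dropped.
    public static List<Map<String, String>> build(List<Map<String, String>> records) {
        List<Map<String, String>> added = new ArrayList<>();
        Map<String, String> doc = new LinkedHashMap<>();
        for (Map<String, String> record : records) {
            try {
                for (Map.Entry<String, String> e : record.entrySet()) {
                    if (e.getKey().equals("skip")) {
                        if (e.getValue().equals("true")) throw new SkipDocException();
                        continue;                  // the flag itself is not a doc field
                    }
                    doc.put(e.getKey(), e.getValue());
                }
                added.add(doc);
                doc = new LinkedHashMap<>();       // fresh doc for the next record
            } catch (SkipDocException skip) {
                doc = new LinkedHashMap<>();       // the fix: discard the partial doc
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Map<String, String> skipped = new LinkedHashMap<>();
        skipped.put("id", "1"); skipped.put("name", "aaa"); skipped.put("skip", "true");
        Map<String, String> kept = new LinkedHashMap<>();
        kept.put("id", "2"); kept.put("name", "bbb");
        System.out.println(build(java.util.Arrays.asList(skipped, kept)));
        // prints "[{id=2, name=bbb}]"
    }
}
```

Without the reset in the catch block, the fields of the skipped "aaa" record would linger in `doc` and pollute (or block) the following "bbb" record, which is exactly the behavior reported.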
Re: DIH $skipDoc issue
yes. it should be . Could you raise an issue? On Fri, Feb 5, 2010 at 4:38 PM, Gian Marco Tagliani gmtagli...@grupointercom.com wrote: Hi, Yes the patch solved my problem. Do you think it could be useful for general use? Gian Marco -Original Message- From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul ??? ?? Sent: viernes, 05 de febrero de 2010 11:20 To: solr-dev@lucene.apache.org Subject: Re: DIH $skipDoc issue did the patch solve your problem? On Fri, Feb 5, 2010 at 12:17 AM, Gian Marco Tagliani gmtagli...@grupointercom.com wrote: Hi all, I'm using the version 1.4.0 of Solr and I'm having some trouble with the DIH when I use the special command $skipDoc. After skipping a document to insert, the next one is not inserted in the proper way. My DIH configuration is quite complex so I'll try to explain myself with a simpler example: item table: id name 1 aaa 2 bbb feature table: Item_id hidden 1 true 2 false DIH conf: document name=products entity name=item query=select * from item field column=ID name=id / field column=NAME name=name / entity name=feature query=select hidden from feature where item_id='${item.ID}' field name=$skipDoc column=hidden / /entity /entity /document The result I expected is that the record named bbb would be imported, but the result of my import case is that the other record (the aaa) has been inserted. I took a look to the DIH code and I found a possible problem that could cause this result. In the DocBuilder class when a $skipDoc is detected, an exception is raised. After handling the exception another loop starts, without cleaning up the doc variable. When the next record is read, the addFieldToDoc method can't fill the doc fields because they are already filled. To solve this problem I just clean up the doc variable when handling the exception. The patch with this tiny change is attached to this mail. Did anybody else encounter this problem? Is the change I did correct? 
Thanks Gian Marco Tagliani -- - Noble Paul | Systems Architect| AOL | http://aol.com -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: indexing a csv file with a multivalued field
You can probably write an UpdateProcessor to achieve this. On Wed, Feb 3, 2010 at 9:33 AM, Seffie Schwartz yschw...@yahoo.com wrote: I am not having luck doing this. Even though I am specifying -F fieldname.separator='|', the fields are stored as one field, not as multiple fields. If I specify -F f.fieldname.separator='|' I get a null pointer exception. -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: Why do we need SolrPluginUtils#optimizePreFetchDocs()
On Tue, Jan 5, 2010 at 4:52 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 5, 2010, at 1:56 AM, Noble Paul നോബിള് नोब्ळ् wrote: This looks like a hack. It currently only uses the highlighter for prefetching docs and fields. There is no standard way for other components to take part in this. Possibly, but highlighting is one of the more expensive things to do and making sure the fields are there (and not lazily loaded) is important. Of course, it doesn't help if you want to use Term Vectors w/ highlighter We should either remove this altogether -1. or have a standard way for all components to take part in this. Perhaps a component could register what fields it needs? However, do you have a use case in mind? What component would you like to have leverage this? I don't know. But the point is: can we have an interface PrefetchAware (or anything nicer), so that components can choose to return the list of fields they are interested in prefetching? I would like to remove the strong coupling of QueryComponent to highlighting. -Grant -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: Why do we need SolrPluginUtils#optimizePreFetchDocs()
2010/1/5 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: On Tue, Jan 5, 2010 at 4:52 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 5, 2010, at 1:56 AM, Noble Paul നോബിള് नोब्ळ् wrote: This looks like a hack. It currently only uses highlighter for prefetching docs and fields . There is no standard way of other components to take part in this. Possibly, but highlighting is one of the more expensive things to do and making sure the fields are there (and not lazily loaded) is important. Of course, it doesn't help if you want to use Term Vectors w/ highlighter We should either remove this altogether -1. or have a standard way for all components to take part in this. Perhaps a component could register what fields it needs? However, do you have a use case in mind? What component would you like to have leverage this? I don't know. But the point is can we have a an interface PrefetchAware (or anything nicer) and components can choose to return the list of fields which it is interested in prefetching. I would like to remove the Strong coupling of QueryComponent on highlighting. Or we can add a method to ResponseBuilder.addPrefetchFields(String[] fieldNames) and SearchComponents can use this in prepare()/process() to express interest in prefetching. -Grant -- - Noble Paul | Systems Architect| AOL | http://aol.com -- - Noble Paul | Systems Architect| AOL | http://aol.com
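The addPrefetchFields() proposal from the thread can be sketched in a few lines of plain Java. The class and method names below are stand-ins, not Solr's real ResponseBuilder/SearchComponent API: each component declares in prepare() which stored fields it will need, and document fetching unions those sets instead of asking the highlighter directly.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the proposed prefetch registration; names are hypothetical.
public class PrefetchSketch {

    // Stand-in for ResponseBuilder.
    public static class ResponseBuilder {
        private final Set<String> prefetchFields = new LinkedHashSet<>();

        public void addPrefetchFields(String... fieldNames) {
            for (String f : fieldNames) prefetchFields.add(f);
        }

        public Set<String> getPrefetchFields() { return prefetchFields; }
    }

    // Stand-in for SearchComponent.prepare().
    public interface Component {
        void prepare(ResponseBuilder rb);
    }

    // Run every component's prepare() and return the union of requested fields.
    public static Set<String> collect(ResponseBuilder rb, List<Component> components) {
        for (Component c : components) c.prepare(rb);
        return rb.getPrefetchFields();
    }
}
```

With this shape, a highlighting component registers its fields like any other component, and QueryComponent needs no highlighter-specific knowledge.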
Re: Why do we need SolrPluginUtils#optimizePreFetchDocs()
ok I have opened a new issue https://issues.apache.org/jira/browse/SOLR-1702 On Tue, Jan 5, 2010 at 5:50 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 5, 2010, at 6:52 AM, Noble Paul നോബിള് नोब्ळ् wrote: On Tue, Jan 5, 2010 at 4:52 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 5, 2010, at 1:56 AM, Noble Paul നോബിള് नोब्ळ् wrote: This looks like a hack. It currently only uses highlighter for prefetching docs and fields . There is no standard way of other components to take part in this. Possibly, but highlighting is one of the more expensive things to do and making sure the fields are there (and not lazily loaded) is important. Of course, it doesn't help if you want to use Term Vectors w/ highlighter We should either remove this altogether -1. or have a standard way for all components to take part in this. Perhaps a component could register what fields it needs? However, do you have a use case in mind? What component would you like to have leverage this? I don't know. But the point is can we have a an interface PrefetchAware (or anything nicer) and components can choose to return the list of fields which it is interested in prefetching. I would like to remove the Strong coupling of QueryComponent on highlighting. Sounds reasonable to me. -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: highlighting setting in solrconfig.xml
Koji has a point. The highlighting syntax has to be deprecated. All the configuration can be put into the HighlightComponent. I shall open an issue. On Mon, Jan 4, 2010 at 9:44 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : If this design is ok, I need to introduce new sub tags like : fragListBuilder/ and fragmentsBuilder/ in highlighting/ : in solrconfig.xml. But now I wonder why the highlighter settings : are in such a very original place rather than searchComponent/. : : Do we have any reason to keep the highlighting/ tag? : Or can we move it into HighlightComponent? I'm not all that aware of what all is involved in the existing highlighting/ config options, but i suspect it was introduced *before* search components, as a way to configure the highlighting utils that were used by multiple request handlers ... moving all of that into init/request params for the HighlightingComponent seems like a good idea to me -- but i wouldn't be surprised if there were some things that make sense to leave as independently initialized objects that are then referenced by name, similar to the way QParserPlugins are initialized separately from the QueryComponent and then referred to by name at request time. but as i said: i know very little about highlighting. -Hoss -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: highlighting setting in solrconfig.xml
new issue https://issues.apache.org/jira/browse/SOLR-1696 2010/1/4 noble.paul noble.p...@corp.aol.com: Koji has a point. the highlight syntax has to be deprecated . All the configurations can be but into the HighlightComponent. I shall open an issue. On Mon, Jan 4, 2010 at 9:44 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : If this design is ok, I need to introduce new sub tags like : fragListBuilder/ and fragmentsBuilder/ in highlighting/ : in solrconfig.xml. But now I wonder why the highligher settings : are in such very original place rather than searchComponent/. : : Do we have any reason to keep highlighting/ tag? : Or can we move it into HighlightComponent? I'm not all that aware of what all is involved in the existing highlighting/ config options, but i suspect it was introduced *before* search components, as a way to configure the highlighting utils that were used by multiple request handlers ... moving all of that into init/request params for the HighlightingComponent seems like a good idea to me -- but i wouldn't be suprised if there were some things that make sense to leave as independently initialized objects that are then refrenced by name, similar to the way QParserPlugins are initialized seperately from the QueryComponent and then refered to by name at request time. but as i said: i know very little about highlighting. -Hoss -- - Noble Paul | Systems Architect| AOL | http://aol.com -- - Noble Paul | Systems Architect| AOL | http://aol.com
Why do we need SolrPluginUtils#optimizePreFetchDocs()
This looks like a hack. It currently only uses the highlighter for prefetching docs and fields. There is no standard way for other components to take part in this. We should either remove this altogether or have a standard way for all components to take part in this. -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: [VOTE] SOLR-1602 Refactor SOLR package structure to include o.a.solr.response and move QueryResponseWriters in there
I have not taken a look at the patch (I could not apply it), but I am in agreement with the idea of moving response writers to a new package. Just that the changes should be backward compatible. On Wed, Dec 30, 2009 at 11:38 PM, Ryan McKinley ryan...@gmail.com wrote: Here's what you're voting on: [ x] Yes, move forward with SOLR-1602 with the plan proposed above [ ] No, don't move forward with SOLR-1602 because... I'll leave the vote open for 72 hours. Votes from SOLR committers are binding, but everyone is welcome to voice your opinion. Not to throw cold water on the formality... but.. when I suggested we get broader approval, i was not thinking about jumping into a formal vote... Seems odd to put a three day window while many people are on vacation :) ryan -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: ValueSourceParser problem
it does not have the code for SocialValueSource.. On Wed, Dec 16, 2009 at 12:18 PM, patrick o'leary pj...@pjaol.com wrote: Rather than subject the list to code, it's pasted here http://www.pasteyourcode.com/13969 On Tue, Dec 15, 2009 at 10:42 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Dec 16, 2009 at 11:58 AM, patrick o'leary pj...@pjaol.com wrote: SEVERE: java.lang.AbstractMethodError at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:439) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525) at org.apache.solr.core.SolrCore.initValueSourceParsers(SolrCore.java:1469) at org.apache.solr.core.SolrCore.init(SolrCore.java:549) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83) at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99) And svn info Path: . URL: http://svn.apache.org/repos/asf/lucene/solr/trunk Repository Root: http://svn.apache.org/repos/asf Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68 Revision: 891117 Node Kind: directory Schedule: normal Last Changed Author: koji Last Changed Rev: 890798 Last Changed Date: 2009-12-15 06:13:59 -0800 (Tue, 15 Dec 2009) I just wrote a custom ValueSourceParser which does not override the init method and it loads fine on current trunk. Can you share your code? -- Regards, Shalin Shekhar Mangar. -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: [ANNOUNCEMENT] HttpComponents HttpClient 4.0.1 (GA) Released
I guess the async part is done by the client library itself. How does it help Solr? On Sun, Dec 13, 2009 at 5:55 PM, Uri Boness ubon...@gmail.com wrote: I think it would also be wise to have a look at Jetty's httpclient. I think its asynchronous nature can play nice for shard requests. see: http://wiki.eclipse.org/Jetty/Tutorial/HttpClient Cheers, Uri Grant Ingersoll wrote: In fact, see https://issues.apache.org/jira/browse/SOLR-1429 On Dec 11, 2009, at 1:05 PM, Grant Ingersoll wrote: There are, in fact, updates to many of Solr's dependencies that we should consider. I like the sound of 5-10% perf. improvement in HttpClient... -Grant On Dec 11, 2009, at 12:53 PM, Erik Hatcher wrote: FYI... Begin forwarded message: From: Oleg Kalnichevski ol...@apache.org Date: December 11, 2009 2:20:50 PM GMT+01:00 To: annou...@apache.org, priv...@hc.apache.org, d...@hc.apache.org, httpclient-us...@hc.apache.org Subject: [ANNOUNCEMENT] HttpComponents HttpClient 4.0.1 (GA) Released Reply-To: HttpComponents Project d...@hc.apache.org HttpClient 4.0.1 is a bug fix release that addresses a number of issues discovered since the previous stable release. None of the fixed bugs is considered critical. Most notably this release eliminates eliminates dependency on JCIP annotations. This release is also expected to improve performance by 5 to 10% due to elimination of unnecessary Log object lookups by short-lived components. --- Download - http://hc.apache.org/downloads.cgi Release notes - http://www.apache.org/dist/httpcomponents/httpclient/RELEASE_NOTES.txt HttpComponents site - http://hc.apache.org/ Please note HttpClient 4.0 currently provides only limited support for NTLM authentication. For details please refer to http://hc.apache.org/httpcomponents-client/ntlm.html --- About Apache HttpClient Although the java.net package provides basic functionality for accessing resources via HTTP, it doesn't provide the full flexibility or functionality needed by many applications. 
HttpClient seeks to fill this void by providing an efficient, up-to-date, and feature-rich package implementing the client side of the most recent HTTP standards and recommendations. Designed for extension while providing robust support for the base HTTP protocol, HttpClient may be of interest to anyone building HTTP-aware client applications such as web browsers, web service clients, or systems that leverage or extend the HTTP protocol for distributed communication. -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: [ANNOUNCEMENT] HttpComponents HttpClient 4.0.1 (GA) Released
On Sun, Dec 13, 2009 at 10:14 PM, Yonik Seeley yo...@lucidimagination.com wrote: 2009/12/13 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: I guess the async part is done by the client library itself. How does it help Solr? The only place that makes sense for async in solr is sending multiple requests as part of a distributed search (potentially hundreds) and not having to have a thread for each. But even in that case I'm not sure it's really a big deal - for each request, the work required of the system as a whole is much greater than the resources a thread takes up (and we use a thread pool to avoid creating/destroying threads all the time). The point is, the underlying HTTP request is still synchronous (correct me if I am wrong). So there is some thread waiting somewhere, either in the HttpClient framework or in the thread pool in Solr. -Yonik http://www.lucidimagination.com -- - Noble Paul | Systems Architect| AOL | http://aol.com
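Noble's point — that with a blocking client some thread must sit and wait on every in-flight request — can be seen in a small self-contained sketch. No HTTP library is involved; a `Thread.sleep` stands in for network plus remote-query time, and the counters just observe how many threads are parked at once when a distributed search fans out over blocking calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ShardFanoutDemo {

    static final AtomicInteger concurrentWaiters = new AtomicInteger();
    static final AtomicInteger maxWaiters = new AtomicInteger();

    /** Simulated synchronous shard request: the calling thread blocks for
     *  the full duration, exactly like a blocking HTTP client call. */
    static String syncShardRequest(int shard) throws InterruptedException {
        int now = concurrentWaiters.incrementAndGet();
        maxWaiters.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(50); // stands in for network + remote query time
            return "response-from-shard-" + shard;
        } finally {
            concurrentWaiters.decrementAndGet();
        }
    }

    /** Fan out to all shards via a pool: with blocking I/O, N concurrent
     *  in-flight requests require N threads waiting somewhere. */
    static List<String> distributedSearch(int shards) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(shards);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < shards; i++) {
                final int shard = i;
                futures.add(pool.submit(() -> syncShardRequest(shard)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) results.add(f.get());
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> results = distributedSearch(8);
        System.out.println(results.size() + " shard responses; peak blocked threads: "
                + maxWaiters.get());
    }
}
```

An asynchronous client (Yonik's "not having to have a thread for each") would drive all the sockets from a small number of I/O threads instead, so the peak-blocked-threads count would not grow with the shard count.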
Re: svn commit: r886127 - in /lucene/solr/trunk/src: solrj/org/apache/solr/client/solrj/beans/DocumentObjectBinder.java test/org/apache/solr/client/solrj/beans/TestDocumentObjectBinder.java
fixed. Thanks Hoss On Fri, Dec 11, 2009 at 7:01 AM, Chris Hostetter hossman_luc...@fucit.org wrote: Noble: 1) you *have* to include a CHANGES.txt entry for every non-trivial commit ... if it has a Jira issue, there better be a CHANGES.txt entry, and the CHANGES.txt entry really needs to be in the same atomic commit as the rest of the changes, not a follow up commit, so code changes can be correlated to why the change was made. 2) CHANGES.txt entries must cite the person who contributed the patch. 3) you have to be careful to cite the correct Jira issue when making commits -- this commit doesn't seem to have anything to do with SOLR-1516, i'm pretty sure it was for SOLR-1357 ...without all three of these things, it's nearly impossible to audit changes later and understand what they were, and who they came from. : Date: Wed, 02 Dec 2009 11:57:17 - : From: no...@apache.org : Reply-To: solr-dev@lucene.apache.org : To: solr-comm...@lucene.apache.org : Subject: svn commit: r886127 - in /lucene/solr/trunk/src: : solrj/org/apache/solr/client/solrj/beans/DocumentObjectBinder.java : test/org/apache/solr/client/solrj/beans/TestDocumentObjectBinder.java : : Author: noble : Date: Wed Dec 2 11:57:15 2009 : New Revision: 886127 : : URL: http://svn.apache.org/viewvc?rev=886127&view=rev : Log: : SOLR-1516 SolrInputDocument cannot process dynamic fields : : Modified: : lucene/solr/trunk/src/solrj/org/apache/solr/client/solrj/beans/DocumentObjectBinder.java : lucene/solr/trunk/src/test/org/apache/solr/client/solrj/beans/TestDocumentObjectBinder.java : : Modified: lucene/solr/trunk/src/solrj/org/apache/solr/client/solrj/beans/DocumentObjectBinder.java : URL: http://svn.apache.org/viewvc/lucene/solr/trunk/src/solrj/org/apache/solr/client/solrj/beans/DocumentObjectBinder.java?rev=886127&r1=886126&r2=886127&view=diff : == : --- lucene/solr/trunk/src/solrj/org/apache/solr/client/solrj/beans/DocumentObjectBinder.java (original) : +++
lucene/solr/trunk/src/solrj/org/apache/solr/client/solrj/beans/DocumentObjectBinder.java Wed Dec 2 11:57:15 2009 : @@ -76,9 +76,19 @@ : } : : SolrInputDocument doc = new SolrInputDocument(); : - for( DocField field : fields ) { : - doc.setField( field.name, field.get( obj ), 1.0f ); : - } : + for (DocField field : fields) { : + if (field.dynamicFieldNamePatternMatcher != null && : + field.get(obj) != null && field.isContainedInMap) { : + Map<String, Object> mapValue = (HashMap<String, Object>) field : + .get(obj); : + : + for (Map.Entry<String, Object> e : mapValue.entrySet()) { : + doc.setField( e.getKey(), e.getValue(), 1.0f); : + } : + } else { : + doc.setField(field.name, field.get(obj), 1.0f); : + } : + } : return doc; : } : : : Modified: lucene/solr/trunk/src/test/org/apache/solr/client/solrj/beans/TestDocumentObjectBinder.java : URL: http://svn.apache.org/viewvc/lucene/solr/trunk/src/test/org/apache/solr/client/solrj/beans/TestDocumentObjectBinder.java?rev=886127&r1=886126&r2=886127&view=diff : == : --- lucene/solr/trunk/src/test/org/apache/solr/client/solrj/beans/TestDocumentObjectBinder.java (original) : +++ lucene/solr/trunk/src/test/org/apache/solr/client/solrj/beans/TestDocumentObjectBinder.java Wed Dec 2 11:57:15 2009 : @@ -25,12 +25,14 @@ : import org.apache.solr.common.SolrInputDocument; : import org.apache.solr.common.SolrInputField; : import org.apache.solr.common.SolrDocument; : +import org.apache.solr.common.util.Hash; : import org.apache.solr.common.util.NamedList; : import org.junit.Assert; : : import java.io.StringReader; : import java.util.Arrays; : import java.util.Date; : +import java.util.HashMap; : import java.util.List; : import java.util.Map; : : @@ -100,6 +102,15 @@ : item.inStock = false; : item.categories = new String[] { "aaa", "bbb", "ccc" }; : item.features = Arrays.asList( item.categories ); : + List<String> supA = Arrays.asList( new String[] { "supA1", "supA2", "supA3" } ); : + List<String> supB = Arrays.asList( new String[] { "supB1", "supB2", "supB3"}); : + item.supplier = new HashMap<String, List<String>>(); : + item.supplier.put("supplier_supA", supA); : + item.supplier.put("supplier_supB", supB); : + : + item.supplier_simple = new HashMap<String, String>(); : + item.supplier_simple.put("sup_simple_supA", "supA_val"); : + item.supplier_simple.put("sup_simple_supB", "supB_val"); : : DocumentObjectBinder binder = new DocumentObjectBinder(); : SolrInputDocument doc = binder.toSolrInputDocument( item
Re: [Solr Wiki] Update of DataImportHandler by DNaber
This needs to be reverted. There was data loss. On Wed, Dec 9, 2009 at 8:46 PM, Apache Wiki wikidi...@apache.org wrote: Dear Wiki user, You have subscribed to a wiki page or wiki category on Solr Wiki for change notification. The DataImportHandler page has been changed by DNaber. http://wiki.apache.org/solr/DataImportHandler?action=diff&rev1=220&rev2=221 -- dataConfig dataSource type=FileDataSource / document + entity name=f processor=FileListEntityProcessor baseDir=/some/path/tongle implicit field called 'plainText'. The content is not parsed in any way, however you may add transformers to manipulate the data within 'plainText' as needed or to create other additional fields. - entity name=f processor=FileListEntityProcessor baseDir=/some/path/to/files fileName=.*xml newerThan='NOW-3DAYS' recursive=true rootEntity=false dataSource=null - entity name=x processor=XPathEntityProcessor forEach=/the/record/xpath url=${f.fileAbsolutePath} - field column=full_name xpath=/field/xpath/ - /entity - /entity - /document - /dataConfig - }}} - Do not miss the `rootEntity` attribute. The implicit fields generated by the !FileListEntityProcessor are `fileAbsolutePath, fileSize, fileLastModified, fileName` and these are available for use within the entity X as shown above. It should be noted that !FileListEntityProcessor returns a list of pathnames and that the subsequent entity must use the !FileDataSource to fetch the files content. + example: - === CachedSqlEntityProcessor === - Anchor(cached) - - This is an extension of the !SqlEntityProcessor. This !EntityProcessor helps reduce the no: of DB queries executed by caching the rows. It does not help to use it in the root most entity because only one sql is run for the entity. - - Example 1.
{{{ - entity name=x query=select * from x - entity name=y query=select * from y where xid=${x.id} processor=CachedSqlEntityProcessor - /entity + entity processor=PlainTextEntityProcessor name=x url=http://abc.com/a.txt; dataSource=data-source-name + !-- copies the text to a field called 'text' in Solr-- + field column=plainText name=text/ - entity + /entity }}} - The usage is exactly same as the other one. When a query is run the results are stored and if the same query is run again it is fetched from the cache and returned + Ensure that the dataSource is of type !DataSourceReader (!FileDataSource, URL!DataSource) - Example 2: - {{{ - entity name=x query=select * from x - entity name=y query=select * from y processor=CachedSqlEntityProcessor where=xid=x.id - /entity - entity - }}} - - The difference with the previous one is the 'where' attribute. In this case the query fetches all the rows from the table and stores all the rows in the cache. The magic is in the 'where' value. The cache stores the values with the 'xid' value in 'y' as the key. The value for 'x.id' is evaluated every time the entity has to be run and the value is looked up in the cache an the rows are returned. - - In the where the lhs (the part before '=') is the column in y and the rhs (the part after '=') is the value to be computed for looking up the cache. - - === PlainTextEntityProcessor === + === LineEntityProcessor === - Anchor(plaintext) + Anchor(LineEntityProcessor) ! [[Solr1.4]] - This !EntityProcessor reads all content from the data source into an single implicit field called 'plainText'. The content is not parsed in any way, however you may add transformers to manipulate the data within 'plainText' as needed or to create other additional fields. + This !EntityProcessor reads all content from the data source on a line by line basis, a field called 'rawLine' is returned for each line read. 
The content is not parsed in any way, however you may add transformers to manipulate the data within 'rawLine' or to create other additional fields. + + The lines read can be filtered by two regular expressions '''acceptLineRegex''' and '''omitLineRegex'''. + This entities additional attributes are: + * '''`url`''' : a required attribute that specifies the location of the input file in a way that is compatible with the configured datasource. If this value is relative and you are using !FileDataSource or URL!DataSource, it assumed to be relative to '''baseLoc'''. + * '''`acceptLineRegex`''' :an optional attribute that if present discards any line which does not match the regExp. + * '''`omitLineRegex`''' : an optional attribute that is applied after any acceptLineRegex and discards any line which matches this regExp. + example: + {{{ + entity name=jc + processor=LineEntityProcessor + acceptLineRegex=^.*\.xml$ + omitLineRegex=/obsolete + url=file:///Volumes/ts/files.lis +
Unable to edit DIH wiki page
It is now impossible to edit the DIH wiki page because of https://issues.apache.org/jira/browse/INFRA-2270. We have to be constantly vigilant about somebody editing that page, because any edit results in loss of data. How do we go about this? -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: Solr Cell revamped as an UpdateProcessor?
I was referring to SOLR-1358. Anyway, Solr Cell as an UpdateProcessor is a good idea. On Tue, Dec 8, 2009 at 4:47 PM, Grant Ingersoll gsing...@apache.org wrote: On Dec 8, 2009, at 12:22 AM, Noble Paul നോബിള് नोब्ळ् wrote: Integrating Extraction w/ DIH is a better option. DIH makes it easier to do the mapping of fields etc. Which comment is this directed at? I'm lacking context here. On Tue, Dec 8, 2009 at 4:59 AM, Grant Ingersoll gsing...@apache.org wrote: On Dec 7, 2009, at 3:51 PM, Chris Hostetter wrote: As someone with very little knowledge of Solr Cell and/or Tika, I find myself wondering if ExtractingRequestHandler would make more sense as an extractingUpdateProcessor -- where it could be configured to take either binary fields (or string fields containing URLs) out of the Documents, parse them with tika, and add the various XPath matching hunks of text back into the document as new fields. Then ExtractingRequestHandler just becomes a handler that slurps up its ContentStreams and adds them as binary data fields and adds the other literal params as fields. Wouldn't that make things like SOLR-1358, and using Tika with URLs/filepaths in XML and CSV based updates fairly trivial? It probably could, but am not sure how it works in a processor chain. However, I'm not sure I understand how they work all that much either. I also plan on adding, BTW, a SolrJ client for Tika that does the extraction on the client. In many cases, the ExtrReqHandler is really only designed for lighter weight extraction cases, as one would simply not want to send that much rich content over the wire. -- - Noble Paul | Systems Architect| AOL | http://aol.com -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search -- - Noble Paul | Systems Architect| AOL | http://aol.com
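Hoss's extractingUpdateProcessor idea could be sketched like this. Everything here is a toy stand-in — the `Doc` and `UpdateProcessor` types are invented for illustration (they are not Solr's SolrInputDocument or UpdateRequestProcessor), and the "extraction" step is plain UTF-8 decoding where the real proposal would invoke Tika. The shape it shows is the point: a chain link that pulls a binary field off the document, extracts text, and adds it back as a new field before the document reaches the indexing step.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ExtractProcessorDemo {

    /** Toy stand-in for SolrInputDocument: field name -> value. */
    static class Doc {
        final Map<String, Object> fields = new HashMap<>();
    }

    /** Toy stand-in for one link in an update-processor chain. */
    interface UpdateProcessor {
        void processAdd(Doc doc);
    }

    /** The idea from the thread: find the binary field, run "extraction"
     *  on it (Tika in the real proposal; UTF-8 decoding here), and add the
     *  result back to the document as a new text field. */
    static class ExtractingProcessor implements UpdateProcessor {
        private final UpdateProcessor next;
        ExtractingProcessor(UpdateProcessor next) { this.next = next; }

        public void processAdd(Doc doc) {
            Object raw = doc.fields.get("raw_content");
            if (raw instanceof byte[]) {
                doc.fields.put("text", new String((byte[]) raw, StandardCharsets.UTF_8));
                doc.fields.remove("raw_content");
            }
            if (next != null) next.processAdd(doc);
        }
    }

    static Doc runChain(byte[] payload) {
        Doc doc = new Doc();
        doc.fields.put("id", "doc-1");
        doc.fields.put("raw_content", payload);
        // The terminal link would be the processor that actually indexes.
        new ExtractingProcessor(d -> { /* index the doc */ }).processAdd(doc);
        return doc;
    }

    public static void main(String[] args) {
        Doc doc = runChain("hello from a rich document".getBytes(StandardCharsets.UTF_8));
        System.out.println(doc.fields.get("text"));
    }
}
```

With this shape, the request handler's only job is slurping content streams into binary fields, and XML/CSV/SolrJ updates get extraction "for free" by including the processor in their chain — which is exactly the appeal Hoss describes.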
Re: How to use dinamic field name in a function with Data import handler
it is supported and it should work. BTW which solr are you using? On Sat, Dec 5, 2009 at 10:43 PM, Renata Mota renata.m...@accurate.com.br wrote: Hi! I’m trying to declare a dynamic field name in a function, but it doesn’t accept, I am doing something wrong or it isn’t possible? EXAMPLE: dataConfig script ![CDATA[ function relatedLevel(row) { row.put('related_'+row.get('SOURCE_NAME')+'_lvl', row.get(‘VALUE')); return row; } ]] /script dataSource type=JdbcDataSource driver=oracle.jdbc.driver.OracleDriver url=jdbc:oracle:thin:@localhost:XX user=test password=test / document entity name=levelRelated transformer=script:relatedLevel query=SELECT SOURCE_NAME, VALUE FROM RELATED / /document /dataConfig SCHEMA: dynamicField name=related_* type=string indexed=true stored=true multiValued=true/ Renata Gonçalves Mota mailto:renata.m...@accurate.com.br renata.m...@accurate.com.br Tel.: 55 11 3522-7723 R.3018 -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: How to use dinamic field name in a function with Data import handler
ok. so that is it. I guess I should try w/1.4 On Sat, Dec 5, 2009 at 11:01 PM, Renata Mota renata.m...@accurate.com.br wrote: I am using solr1.3 -Original Message- From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul ??? ?? Sent: Saturday, December 05, 2009 3:23 PM To: solr-dev@lucene.apache.org Subject: Re: How to use dinamic field name in a function with Data import handler it is supported and it should work. BTW which solr are you using? On Sat, Dec 5, 2009 at 10:43 PM, Renata Mota renata.m...@accurate.com.br wrote: Hi! I’m trying to declare a dynamic field name in a function, but it doesn’t accept, I am doing something wrong or it isn’t possible? EXAMPLE: dataConfig script ![CDATA[ function relatedLevel(row) { row.put('related_'+row.get('SOURCE_NAME')+'_lvl', row.get(‘VALUE')); return row; } ]] /script dataSource type=JdbcDataSource driver=oracle.jdbc.driver.OracleDriver url=jdbc:oracle:thin:@localhost:XX user=test password=test / document entity name=levelRelated transformer=script:relatedLevel query=SELECT SOURCE_NAME, VALUE FROM RELATED / /document /dataConfig SCHEMA: dynamicField name=related_* type=string indexed=true stored=true multiValued=true/ Renata Gonçalves Mota mailto:renata.m...@accurate.com.br renata.m...@accurate.com.br Tel.: 55 11 3522-7723 R.3018 -- - Noble Paul | Systems Architect| AOL | http://aol.com -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: TestContentStreamDataSource failing on trunk
SOLR-1600 ensures that for all multivalued fields, the response type is a collection, so your change is right. On Thu, Nov 26, 2009 at 8:38 AM, Chris Hostetter hossman_luc...@fucit.org wrote: the lucene zone machine is down (see INFRA-2351) so i don't think nightly builds are running. On my local box i'm seeing TestContentStreamDataSource fail... junit.framework.AssertionFailedError: expected:<Hello C1> but was:<[Hello C1]> at org.apache.solr.handler.dataimport.TestContentStreamDataSource.testSimple(TestContentStreamDataSource.java:67) Noble: is it possible this was caused by your SOLR-1600 changes? (which don't seem to be listed in CHANGES.txt ... what's up with that?!?) i think that [] syntax is ant's way of stringifying a collection. If i modify the test like this... - assertEquals("Hello C1", doc.getFieldValue("desc")); + assertEquals("Hello C1", doc.getFirstValue("desc")); ...it starts to pass, but i don't just want to commit that change without being sure we understand why it broke in the first place, and whether it's an indication of how something might break for end users. -Hoss -- - Noble Paul | Principal Engineer| AOL | http://aol.com
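The behavior change behind the test failure is easy to model: once a multivalued field always comes back as a collection, a comparison against `getFieldValue` sees the collection's `[Hello C1]` string form, while a `getFirstValue`-style accessor unwraps it. This is a toy stand-in for the document class (hypothetical, not Solr's real SolrDocument), just to make the two access patterns concrete:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FirstValueDemo {

    /** Minimal stand-in for a response document after SOLR-1600: every
     *  multivalued field is stored as a collection, even when it currently
     *  holds a single value. */
    static class Document {
        private final Map<String, List<Object>> fields = new HashMap<>();

        void addField(String name, Object value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        /** Returns the raw stored value: a collection for multivalued fields,
         *  which is why the assertion saw "[Hello C1]". */
        Object getFieldValue(String name) {
            return fields.get(name);
        }

        /** Unwraps the collection to its first element -- the access pattern
         *  Hoss switched the test to. */
        Object getFirstValue(String name) {
            List<Object> v = fields.get(name);
            return (v == null || v.isEmpty()) ? null : v.get(0);
        }
    }

    public static void main(String[] args) {
        Document doc = new Document();
        doc.addField("desc", "Hello C1");
        System.out.println(doc.getFieldValue("desc")); // collection, brackets and all
        System.out.println(doc.getFirstValue("desc")); // unwrapped single value
    }
}
```

So the `[]` is not ant's stringification — it is the collection's own `toString()`, and end-user code comparing single values against `getFieldValue` on a multivalued field would break the same way the test did.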
Re: [jira] Created: (SOLR-1592) Refactor XMLWriter startTag to allow arbitrary attributes to be written
Why don't we make the response writers deal w/ SolrDocument instead of a Lucene Document? That way we can get rid of a lot of ugly code. SOLR-1516 enables response writers to do that. On Tue, Nov 24, 2009 at 6:44 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Nov 23, 2009 at 7:04 PM, Chris Hostetter hossman_luc...@fucit.org wrote: XMLWriter was originally created to be a wrapper around a java.io.Writer that had convenient helper methods for generating a specific XML response format (ie: wt=xml) back before Solr even supported multiple output types. Indeed - my longer term plans always included getting rid of it and re-implementing as a subclass of TextResponseWriter and getting rid of the XMLWriter set of methods on FieldType: /** * Renders the specified field as XML */ public abstract void write(XMLWriter xmlWriter, String name, Fieldable f) throws IOException; /** * calls back to TextResponseWriter to write the field value */ public abstract void write(TextResponseWriter writer, String name, Fieldable f) throws IOException; ResponseWriters in general have always been very expert level... we change as we need to add new features. The specific implementations were certainly not designed to be subclassed by users with any back compat guarantees. People were not even able to subclass XMLWriter in the past. http://search.lucidimagination.com/search/document/4d47d6248a54298d/custom_query_response_writer -Yonik http://www.lucidimagination.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com
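The appeal of writing from a SolrDocument rather than a Lucene Document is that the writer only ever sees plain name/value pairs of ordinary Java objects, so it never needs the per-FieldType `write(...)` callbacks Yonik quotes. A rough self-contained sketch of that shape — the types here are invented (a plain `Map` stands in for SolrDocument; this is not Solr's TextResponseWriter):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SolrDocWriterDemo {

    /** If the writer receives a SolrDocument-like bag of already-materialized
     *  Java values, serialization is a simple loop dispatching on the value's
     *  runtime type -- no Fieldable/FieldType callback protocol required. */
    static String writeXml(Map<String, Object> doc) {
        StringBuilder out = new StringBuilder("<doc>");
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            Object v = e.getValue();
            // Pick the element name from the Java type of the value.
            String tag = (v instanceof Integer || v instanceof Long) ? "int"
                       : (v instanceof Float || v instanceof Double) ? "float"
                       : "str";
            out.append('<').append(tag).append(" name=\"").append(e.getKey())
               .append("\">").append(v).append("</").append(tag).append('>');
        }
        return out.append("</doc>").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", 42);
        doc.put("title", "hello");
        System.out.println(writeXml(doc));
    }
}
```

With stored Lucene Documents, by contrast, values arrive as Fieldables whose interpretation depends on schema FieldTypes, which is what forces the double `write(XMLWriter, ...)` / `write(TextResponseWriter, ...)` methods being complained about.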
Re: Solr 1.5 or 2.0?
Option 3 looks best. But do we plan to remove anything we have not already marked as deprecated? On Thu, Nov 19, 2009 at 8:10 PM, Uwe Schindler u...@thetaphi.de wrote: We also had some (maybe helpful) opinions :-) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Thursday, November 19, 2009 3:31 PM To: java-dev@lucene.apache.org Subject: Re: Solr 1.5 or 2.0? Oops... of course I meant to post this in solr-dev. -Yonik http://www.lucidimagination.com On Wed, Nov 18, 2009 at 8:53 PM, Yonik Seeley yo...@lucidimagination.com wrote: What should the next version of Solr be? Options: - have a Solr 1.5 with a lucene 2.9.x - have a Solr 1.5 with a lucene 3.x, with weaker back compat given all of the removed lucene deprecations from 2.9-3.0 - have a Solr 2.0 with a lucene 3.x -Yonik http://www.lucidimagination.com - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org -- - Noble Paul | Principal Engineer| AOL | http://aol.com - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: Solr 1.5 or 2.0?
On Fri, Nov 20, 2009 at 6:30 AM, Ryan McKinley ryan...@gmail.com wrote: On Nov 19, 2009, at 3:34 PM, Mark Miller wrote: Ryan McKinley wrote: I would love to set goals that are ~3 months out so that we don't have another 1 year release cycle. For a 2.0 release where we could have more back-compatibly flexibility, i would love to see some work that may be too ambitious... In particular, the config spaghetti needs some attention. I don't see the need to increment solr to 2.0 for the lucene 3.0 change -- of course that needs to be noted, but incrementing the major number in solr only makes sense if we are going to change *solr* significantly. Lucene major numbers don't work that way, and I don't think Solr needs to work that way be default. I think major numbers are better for indicating backwards compat issues than major features with the way these projects work. Which is why Yonik mentions 1.5 with weaker back compat - its not just the fact that we are going to Lucene 3.x - its that Solr still relies on some of the API's that won't be around in 3.x - they are not all trivial to remove or to remove while preserving back compat. I confess I don't know the details of the changes that have not yet been integrated in solr -- the only lucene changes I am familiar with is what was required for solr 1.4. The lucene 2.x - 3.0 upgrade path seems independent of that to me. I would even argue that with solr 1.4 we have already required many lucene 3.0 changes -- All my custom lucene stuff had to be reworked to work with solr 1.4 (tokenizers multi-reader filters). Many - but certainly not all. Just my luck... I'm batting 1000 :) But that means my code can upgrade to 3.0 without a issue now! In general, I wonder where the solr back-compatibility contract applies (and to what degree). For solr, I would rank the importance as: #1 - the URL API syntax. 
Client query parameters should change as little as possible #2 - configuration #3 - java APIs Someone else would likely rank it differently - not everyone using Solr even uses HTTP with it. Someone heavily involved in custom plugins might care more about that than config. As a dev, I just plainly rank them all as important and treat them on a case by case basis. I think it is fair to suggest that people will have the most stable/consistent/seamless upgrade path if you stick to the HTTP API (and by extension most of the solrj API) I am not suggesting that the java APIs are not important and that back-compatibly is not important. Solr has a some APIs with a clear purpose, place, and intended use -- we need to take these very seriously. We also have lots of APIs that are half baked and loosy goosy. If a developer is working on the edges, i think it is fair to expect more hickups in the upgrade path. With that in mind, i think 'solr 1.5 with lucene 3.x' makes the most sense. Unless we see making serious changes to solr that would warrent a major release bump solr 1.5 with lucene 3.x is a good option. Solr 2.0 can have non-back compat changes for Solr itself. e.g removing the single core option , changing configuration, REST Api changes etc What is a serious change that would warrant a bump in your opinion? for example: - config overhaul. detangle the XML from the components. perhaps using spring. This is already done. No components read config from xml anymore SOLR-1198 - major URL request changes. perhaps we change things to be more RESTful -- perhaps let jersey take care of the URL/request building https://jersey.dev.java.net/ - perhaps OSGi support/control/configuration Lucene has an explict back-compatibility contract: http://wiki.apache.org/lucene-java/BackwardsCompatibility I don't know if solr has one... 
if we make one, I would like it to focus on the URL syntax+configuration Its not nice to give people plugins and then not worry about back compat for them :) i want to be nice. I just think that a different back compatibility contract applies for solr then for lucene. It seems reasonable to consider the HTTP API, configs, and java API independently. From my perspective, saying solr 1.5 uses lucene 3.0 implies everything a plugin developer using lucene APIs needs to know about the changes. To be clear, I am not against bumping to solr 2.0 -- I just have high aspirations (yet little time) for what a 2.0 bump could mean for solr. ryan - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org -- - Noble Paul | Principal Engineer| AOL | http://aol.com - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: svn commit: r881724 - in /lucene/solr/trunk/contrib/dataimporthandler/src: main/java/org/apache/solr/handler/dataimport/ test/java/org/apache/solr/handler/dataimport/
sure. I forgot On Wed, Nov 18, 2009 at 5:00 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Noble - how about a CHANGES.txt update for this? Thanks, Erik

Begin forwarded message: From: no...@apache.org Date: November 18, 2009 12:22:18 PM GMT+01:00 To: solr-comm...@lucene.apache.org Subject: svn commit: r881724 - in /lucene/solr/trunk/contrib/dataimporthandler/src: main/java/org/apache/solr/handler/dataimport/ test/java/org/apache/solr/handler/dataimport/ Reply-To: solr-dev@lucene.apache.org

Author: noble
Date: Wed Nov 18 11:22:17 2009
New Revision: 881724
URL: http://svn.apache.org/viewvc?rev=881724&view=rev
Log: SOLR-1525 allow DIH to refer to core properties

Modified:
    lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java
    lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/VariableResolverImpl.java
    lucene/solr/trunk/contrib/dataimporthandler/src/test/java/org/apache/solr/handler/dataimport/TestVariableResolver.java

Modified: lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java
URL: http://svn.apache.org/viewvc/lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java?rev=881724&r1=881723&r2=881724&view=diff
==
--- lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java (original)
+++ lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java Wed Nov 18 11:22:17 2009
@@ -80,7 +80,9 @@
   public VariableResolverImpl getVariableResolver() {
     try {
-      VariableResolverImpl resolver = new VariableResolverImpl();
+      VariableResolverImpl resolver = null;
+      if (dataImporter != null && dataImporter.getCore() != null)
+        resolver = new VariableResolverImpl(dataImporter.getCore().getResourceLoader().getCoreProperties());
+      else resolver = new VariableResolverImpl();
       Map<String, Object> indexerNamespace = new HashMap<String, Object>();
       if (persistedProperties.getProperty(LAST_INDEX_TIME) != null) {
         indexerNamespace.put(LAST_INDEX_TIME, persistedProperties.getProperty(LAST_INDEX_TIME));

Modified: lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/VariableResolverImpl.java
URL: http://svn.apache.org/viewvc/lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/VariableResolverImpl.java?rev=881724&r1=881723&r2=881724&view=diff
==
--- lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/VariableResolverImpl.java (original)
+++ lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/VariableResolverImpl.java Wed Nov 18 11:22:17 2009
@@ -18,6 +18,7 @@
 import java.util.HashMap;
 import java.util.Map;
+import java.util.Collections;
 import java.util.regex.Pattern;
 /**
@@ -41,7 +42,14 @@
   private final TemplateString templateString = new TemplateString();
+  private final Map defaults;
+
+  public VariableResolverImpl() {
+    defaults = Collections.emptyMap();
+  }
+
+  public VariableResolverImpl(Map defaults) {
+    this.defaults = defaults;
   }
   /**
@@ -100,23 +108,30 @@
       for (int i = 0; i < parts.length; i++) {
         String thePart = parts[i];
         if (i == parts.length - 1) {
-          return namespace.get(thePart);
+          Object val = namespace.get(thePart);
+          return val == null ? getDefault(name) : val;
         }
         Object temp = namespace.get(thePart);
         if (temp == null) {
-          return namespace.get(mergeAll(parts, i));
+          Object val = namespace.get(mergeAll(parts, i));
+          return val == null ? getDefault(name) : val;
         } else {
           if (temp instanceof Map) {
             namespace = (Map) temp;
           } else {
-            return null;
+            return getDefault(name);
           }
         }
       }
     } finally {
-      CURRENT_VARIABLE_RESOLVER.set(null);
+      CURRENT_VARIABLE_RESOLVER.remove();
     }
-    return null;
+    return getDefault(name);
+  }
+
+  private Object getDefault(String name) {
+    Object val = defaults.get(name);
+    return val == null ? System.getProperty(name) : val;
   }

   private String mergeAll(String[] parts, int i) {

Modified: lucene/solr/trunk/contrib/dataimporthandler/src/test/java/org/apache/solr/handler/dataimport/TestVariableResolver.java
URL: http://svn.apache.org/viewvc/lucene/solr/trunk/contrib/dataimporthandler/src/test/java/org/apache/solr/handler/dataimport/TestVariableResolver.java?rev=881724&r1=881723&r2=881724&view=diff
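For illustration, the lookup order this patch introduces — the variable namespace first, then the injected defaults (the core properties), then a JVM system property — can be sketched as a standalone class. This is my own simplified rendering, not the actual DIH code; the class and variable names are mine:

```java
import java.util.HashMap;
import java.util.Map;

// Standalone sketch of the SOLR-1525 lookup order: the variable namespace
// wins, then the injected defaults map (e.g. core properties), then a JVM
// system property. Not the real VariableResolverImpl.
class ResolverSketch {
    private final Map<String, Object> namespace;
    private final Map<String, Object> defaults;   // e.g. core properties

    ResolverSketch(Map<String, Object> namespace, Map<String, Object> defaults) {
        this.namespace = namespace;
        this.defaults = defaults;
    }

    Object resolve(String name) {
        Object val = namespace.get(name);
        if (val != null) return val;
        Object def = defaults.get(name);          // fall back to defaults...
        return def != null ? def : System.getProperty(name); // ...then sysprops
    }

    public static void main(String[] args) {
        Map<String, Object> ns = new HashMap<>();
        ns.put("last_index_time", "2009-11-18 11:22:17");
        Map<String, Object> coreProps = new HashMap<>();
        coreProps.put("solr.data.dir", "/var/solr/data");

        ResolverSketch r = new ResolverSketch(ns, coreProps);
        System.out.println(r.resolve("last_index_time"));  // found in the namespace
        System.out.println(r.resolve("solr.data.dir"));    // falls back to defaults
    }
}
```

The point of the patch is exactly this fallback chain: a DIH template like a core property name now resolves even when it is absent from the entity's own namespace.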
Re: DataImportHandler Error (SEVERE: Ignoring Error when closing connection)
this should not cause any harm. it is just closing the connection at the end of the process. But the stacktrace could be avoided On Tue, Nov 10, 2009 at 2:39 AM, Pradeep Pujari prade...@rocketmail.com wrote: I am getting the following error in solr/trunk code while do a full-dataimport from db2 database. INFO: Read dataimport.properties Nov 9, 2009 1:02:55 PM org.apache.solr.handler.dataimport.SolrWriter persist INFO: Wrote last indexed time to dataimport.properties Nov 9, 2009 1:02:55 PM org.apache.solr.handler.dataimport.DocBuilder execute INFO: Time taken = 0:0:2.704 Nov 9, 2009 1:02:55 PM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection SEVERE: Ignoring Error when closing connection com.ibm.db2.jcc.a.SqlException: java.sql.Connection.close() requested while a transaction is in progress on the connection.The transaction remains active, and the connection cannot be closed. at com.ibm.db2.jcc.a.p.r(p.java:1385) at com.ibm.db2.jcc.a.p.u(p.java:1419) at com.ibm.db2.jcc.a.p.s(p.java:1395) at com.ibm.db2.jcc.a.p.close(p.java:1378) at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:399) at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:390) at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:173) at org.apache.solr.handler.dataimport.DataConfig.clearCaches(DataConfig.java:331) at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:339) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389) at org.apache.solr.handler.dataimport.DataImporter$3.run(DataImporter.java:370) -- - Noble Paul | Principal Engineer| AOL | http://aol.com
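The driver's complaint points at the usual remedy: finish the transaction (commit or roll back) before calling close() when autoCommit is off. A hedged sketch of that pattern — TxConnection is a made-up stand-in for the few java.sql.Connection methods involved, and this is not Solr's actual JdbcDataSource code:

```java
// Illustrative fix for "close() requested while a transaction is in progress":
// end the open transaction before closing when autoCommit is off. TxConnection
// is a minimal stand-in for java.sql.Connection so the sketch is self-contained;
// the same methods exist on the real interface.
interface TxConnection {
    boolean getAutoCommit() throws Exception;
    void rollback() throws Exception;
    void close() throws Exception;
}

class SafeClose {
    static void closeQuietly(TxConnection c) {
        if (c == null) return;
        try {
            if (!c.getAutoCommit()) {
                c.rollback();      // end the open transaction first
            }
        } catch (Exception ignored) {
            // best effort: still try to close below
        }
        try {
            c.close();
        } catch (Exception ignored) {
            // log-and-ignore, as JdbcDataSource.closeConnection already does
        }
    }
}
```

With a driver like DB2's, which refuses to close mid-transaction, the rollback-before-close makes the stack trace above go away rather than merely hiding it.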
Re: [VOTE] 1.4 new RC up
+1 On Mon, Nov 9, 2009 at 8:34 AM, Erik Hatcher erik.hatc...@gmail.com wrote: +1 On Nov 6, 2009, at 1:28 PM, Grant Ingersoll wrote: OK, done. Same place as always. Looks like the Lucene release is finally going through, so let's get this finished up! On Nov 5, 2009, at 6:55 PM, Grant Ingersoll wrote: Sure. I'll do it tonight or first thing tomorrow morning. On Nov 5, 2009, at 6:04 PM, Yonik Seeley wrote: On Thu, Nov 5, 2009 at 7:40 PM, Grant Ingersoll gsing...@apache.org wrote: Hopefully, 4th time is the charm: http://people.apache.org/~gsingers/solr/1.4.0/ Can we respin???!!! https://issues.apache.org/jira/browse/SOLR-1543 -Yonik http://www.lucidimagination.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: SEVERE: SolrIndexWriter was not closed prior to finalize
On Fri, Nov 6, 2009 at 2:15 PM, Chantal Ackermann chantal.ackerm...@btelligent.de wrote: Noble Paul നോബിള് नोब्ळ् schrieb: These logs were added to warn us developers on some missing cleanups. Doing cleanups in finalize() is not considered clean. It should not cause any harm other than those nasty messages. that's what I thought because I didn't experience any problems. But doesn't that indicate that there is a missing cleanup? cleanup still happens. But we did not wish it to happen in finalize(). And if it is not worth checking / changing then maybe the log level should be decreased to adjust in comparison to other SEVERE warnings? It's just that monitoring systems will of course raise an alarm on SEVERE messages. And even if SOLR runs perfectly these messages would give the impression that something's going wrong. This was a bug in Solr and there was a consensus to put it in as a SEVERE one. It is a code problem. I guess the message should be at WARNING level. Cheers, Chantal On Fri, Nov 6, 2009 at 4:59 AM, markwaddle m...@markwaddle.com wrote: For what it's worth, I have encountered this error in the logs nearly once a day for the last 3 weeks. It appears so often, yet so inconsistently, that it does not seem to occur while performing a specific operation or near a specific error. Mark Chantal Ackermann wrote: Hi all, just wanted to post this log output because it has 3 exclamation marks which makes it sound important. ;-) It has happened after an index process on one core was rolled back. The select request in between was issued on a different core. I have seen this message before but also only after some exception happened. I just reindexed successfully (no rollback) and no SEVERE reappeared. Otherwise everything works fine, so I suppose it's more a matter of log message placement / log level choice etc. 
Cheers, Chantal 05.11.2009 16:13:23 org.apache.solr.update.DirectUpdateHandler2 rollback INFO: start rollback 05.11.2009 16:13:23 org.apache.solr.update.DirectUpdateHandler2 rollback INFO: end_rollback 05.11.2009 16:14:33 org.apache.solr.core.SolrCore execute INFO: [sei] webapp=/solr path=/dataimport params={} status=0 QTime=9 05.11.2009 16:43:38 org.apache.solr.core.SolrCore execute INFO: [epg] webapp=/solr path=/select params={sort=start_date+asc,start_date+ascstart=0q=%2Bstart_date:[*+TO+NOW]+%2Bend_date:[NOW+TO+*]+%2Bruntime:[5+TO+300]wt=javabinrows=20version=1} hits=8 status=0 QTime=297 05.11.2009 17:10:25 org.apache.solr.update.SolrIndexWriter finalize SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!! -- View this message in context: http://old.nabble.com/SEVERE%3A-SolrIndexWriter-was-not-closed-prior-to-finalize-tp26217896p26224126.html Sent from the Solr - Dev mailing list archive at Nabble.com. -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Avro in Solr
Structured formats have a lot of limitations when it comes to solr. The number and name of fields in any document is completely arbitrary in Solr. Is it possible to represent such a data structure in Avro? On Wed, Nov 4, 2009 at 3:43 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Hello, Avro is still young, from what I know, but I'm wondering if anyone has any thoughts on whether there is a place or need for Avro in Solr? http://www.cloudera.com/blog/2009/11/02/avro-a-format-for-big-data/ Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR -- - Noble Paul | Principal Engineer| AOL | http://aol.com
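To the question of arbitrary field names: Avro can model them with its map type and a union of value types, though that forfeits most of the per-field schema information that motivates Avro in the first place. A sketch of one such schema — the record and field names here are mine, not anything Solr defines:

```json
{
  "type": "record",
  "name": "SolrDocumentSketch",
  "fields": [
    {
      "name": "fields",
      "type": {
        "type": "map",
        "values": ["null", "string", "long", "double", "boolean",
                   {"type": "array", "items": "string"}]
      }
    }
  ]
}
```

Every document is then a map from field name to one of the union's types, which matches Solr's "any number of fields, any names" model at the cost of schema evolution and per-field typing.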
Re: release announcement draft
On Sun, Nov 1, 2009 at 7:35 PM, Yonik Seeley yo...@lucidimagination.com wrote: I'm also trying to reuse the first paragraph to come up with an update to our front page description to basically define Solr. I'll think about how I can fit in the cross-language aspects... Perhaps it deserves a second paragraph. One characterization of Solr I've heard in the past is that it's just basically a wrapper around Lucene - something I emphatically disagree with and am trying to leave behind a bit. As Solr matures, it needs to stand more on its own, rather than to define itself in comparison to Lucene or be easier than Lucene. +1 Solr should eventually come out of the shadows of Lucene. It should not be known as just a Lucene wrapper -Yonik On Sat, Oct 31, 2009 at 10:23 PM, Israel Ekpo israele...@gmail.com wrote: Your announcement looks great! However, I would like to add just a little more text to the introduction part of Solr, especially for people that may have heard about Lucene before but are hearing about Solr for the very first time. One of the reasons most developers are not involved with using Lucene for creating search applications is one of the following factors: 1. From my perspective, it's a bit complicated to set up and use out of the box. It involves a fair amount of heavy lifting to make one's search application utilize most of the features the Java version of lucene has to offer. 2. If you are not using Java, most of the other ports of Lucene are usually behind in terms of the features offered by the Java version of Lucene. 3. In some programming languages such as ActionScript, PHP, Objective-C no reliable/effective lucene port is available. Now, thanks to Solr the language barrier excuse is gone, especially because of the ability to interact with the search server via HTTP and XML. Hence, via Solr you can take advantage of virtually all the features Lucene 2.9 has to offer and even more without any headache of implementing Lucene. 
The power of Web services should never be underestimated. Via Solr, developers around the world can now deploy the amazing features offered by Lucene 2.9 in virtually any programming language such as ActionScript, JavaScript, C, Visual Basic, Objective-C etc. Personally, the very first time I heard about Solr, the first impression I got was that it is just another port of Lucene or Java library based on Lucene, and this is completely false. So I think it would be nice if you could include the http feature of Solr, so to speak, in the introduction section of your announcement just to clarify that it is not just another Java library based on Lucene. Again, this addition is targeted only towards individuals just hearing about Solr for the very first time. So I would suggest adding the following text, hopefully without cluttering the presentation: --BEGIN-- Solr is not just another Java library based on Lucene. Nevertheless, powered by Lucene 2.9 internally, it is a standalone enterprise search server with a web-services-like API that allows one to index documents in XML or CSV format over HTTP. The contents of the index can then be queried via HTTP and retrieved as an XML response, therefore making it seamlessly simple to deploy the amazing features offered by the enterprise search server in virtually any programming language such as ActionScript, JavaScript, C, Visual Basic, Objective-C etc. --END-- --OPTIONAL-- It's so easy even a caveman can use it! --OPTIONAL-- -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: release announcement draft
+1 On Mon, Nov 2, 2009 at 8:08 AM, Yonik Seeley yo...@lucidimagination.com wrote: OK, w/ grammar fixes from Israel (also checked into trunk): but I thought since it was referring to Solr (singular) It's referring to the APIs (plural). -Yonik http://www.lucidimagination.com -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- Apache Solr 1.4 has been released and is now available for public download! http://www.apache.org/dyn/closer.cgi/lucene/solr/ Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required. 
New Solr 1.4 features include:
- Major performance enhancements in indexing, searching, and faceting
- Revamped all-Java index replication that's simple to configure and can replicate config files
- Greatly improved database integration via the DataImportHandler
- Rich document processing (Word, PDF, HTML) via Apache Tika
- Dynamic search results clustering via Carrot2
- Multi-select faceting (support for multiple items in a single category to be selected)
- Many powerful query enhancements, including ranges over arbitrary functions, and nested queries of different syntaxes
- Many other plugins including Terms for auto-suggest, Statistics, TermVectors, Deduplication

-- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: odd DIH $docBoost/multiValued=true issue
It is the behavior of the XML parser: if the field is multiValued, the value is always a List<String>. In the field mapping you may explicitly mention

  <field column="$docBoost" xpath="/doc/@boost" multiValued="false"/>

and this will override the schema setting.

On Wed, Oct 28, 2009 at 4:11 PM, Erik Hatcher erik.hatc...@gmail.com wrote: I've got a situation where I'm bringing in a document boost factor from some XML (which comes from another entity in the DIH pipeline). It maps in like this: <field column="$docBoost" xpath="/doc/@boost"/> I got a parse exception (in DocBuilder where it parses $docBoost) because the value is an array. I do have a catch-all in my schema: <dynamicField name="*" type="string" multiValued="true" /> When I set multiValued to false, it all worked fine. Thoughts? Thanks, Erik -- - Noble Paul | Principal Engineer| AOL | http://aol.com
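For context, the override lives on the entity's field mapping in data-config.xml while schema.xml keeps its catch-all dynamic field. The entity attributes below are illustrative stand-ins, not Erik's actual config:

```xml
<!-- data-config.xml sketch: multiValued="false" on the DIH mapping overrides
     the schema's catch-all <dynamicField name="*" ... multiValued="true"/>,
     so $docBoost arrives as a single value instead of a List. The processor,
     url, and forEach values here are hypothetical. -->
<entity name="doc" processor="XPathEntityProcessor"
        url="boosted-docs.xml" forEach="/doc">
  <field column="$docBoost" xpath="/doc/@boost" multiValued="false"/>
</entity>
```

$docBoost is one of DIH's special columns, which is why DocBuilder parses it as a number and chokes on a List.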
Re: Dinamic field name with Data import handler
On Thu, Oct 22, 2009 at 7:51 PM, Renata Mota renata.m...@accurate.com.br wrote: Hi, I'm trying to give dynamic names to a field with the data import handler, but I can't get it to work. Example:

  <entity name="users" query="SELECT ID, NAME FROM USER">
    <field column="NAME" name="name_'${ users. id}'_s" />
  </entity>

hey, this is supposed to work. is it because there is a space in the name attribute?

Is it possible to do something like this? Thanks, Renata Gonçalves Mota mailto:renata.m...@accurate.com.br renata.m...@accurate.com.br Tel.: 55 11 3522-7723 R.3018 -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: [VOTE] Release Solr 1.4.0
I shall fix https://issues.apache.org/jira/browse/SOLR-1527 for 1.4 I guess On Wed, Oct 28, 2009 at 8:20 AM, Erik Hatcher erik.hatc...@gmail.com wrote: +1 also I'll have a look at the duplicate libs issue as soon as I can, but won't be until after/during next week (ApacheCon). Erik On Oct 27, 2009, at 10:41 PM, Chris Hostetter wrote: : OK, new artifacts are up. +1 And for the record, these are the artifacts my vote is based on... 8166f7f23637fa8a7d84c3cd30aa21ab apache-solr-1.4.0.tgz f7ffa8669e12271981c212733bee1ec0 apache-solr-1.4.0.zip -Hoss -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: [VOTE] Release Solr 1.4.0
OK . Let us push it to 1.5. On Wed, Oct 28, 2009 at 10:01 AM, Ryan McKinley ryan...@gmail.com wrote: On Oct 28, 2009, at 12:07 AM, Chris Hostetter wrote: : It's not a regression, but a new, non-core feature. If we delay every : time we find a bug, this release will never end. agreed. agreed. And assuming lucene 3.0 comes out in the somewhat near future, we will have an easy place for minor bug fixes soon enough. -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Solr nightly build failure
Apparently the replication handler tests are failing too frequently. Any idea why? On Fri, Oct 23, 2009 at 2:12 PM, solr-dev@lucene.apache.org wrote: init-forrest-entities: [mkdir] Created dir: /tmp/apache-solr-nightly/build [mkdir] Created dir: /tmp/apache-solr-nightly/build/web compile-solrj: [mkdir] Created dir: /tmp/apache-solr-nightly/build/solrj [javac] Compiling 87 source files to /tmp/apache-solr-nightly/build/solrj [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. compile: [mkdir] Created dir: /tmp/apache-solr-nightly/build/solr [javac] Compiling 389 source files to /tmp/apache-solr-nightly/build/solr [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. compileTests: [mkdir] Created dir: /tmp/apache-solr-nightly/build/tests [javac] Compiling 177 source files to /tmp/apache-solr-nightly/build/tests [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. dist-contrib: init: [mkdir] Created dir: /tmp/apache-solr-nightly/contrib/clustering/build/classes [mkdir] Created dir: /tmp/apache-solr-nightly/contrib/clustering/lib/downloads [mkdir] Created dir: /tmp/apache-solr-nightly/build/docs/api init-forrest-entities: compile-solrj: compile: [javac] Compiling 1 source file to /tmp/apache-solr-nightly/build/solr [javac] Note: /tmp/apache-solr-nightly/src/java/org/apache/solr/search/DocSetHitCollector.java uses or overrides a deprecated API. 
[javac] Note: Recompile with -Xlint:deprecation for details. make-manifest: [mkdir] Created dir: /tmp/apache-solr-nightly/build/META-INF proxy.setup: check-files: get-colt: [get] Getting: http://repo1.maven.org/maven2/colt/colt/1.2.0/colt-1.2.0.jar [get] To: /tmp/apache-solr-nightly/contrib/clustering/lib/downloads/colt-1.2.0.jar get-pcj: [get] Getting: http://repo1.maven.org/maven2/pcj/pcj/1.2/pcj-1.2.jar [get] To: /tmp/apache-solr-nightly/contrib/clustering/lib/downloads/pcj-1.2.jar get-nni: [get] Getting: http://download.carrot2.org/maven2/org/carrot2/nni/1.0.0/nni-1.0.0.jar [get] To: /tmp/apache-solr-nightly/contrib/clustering/lib/downloads/nni-1.0.0.jar get-simple-xml: [get] Getting: http://mirrors.ibiblio.org/pub/mirrors/maven2/org/simpleframework/simple-xml/1.7.3/simple-xml-1.7.3.jar [get] To: /tmp/apache-solr-nightly/contrib/clustering/lib/downloads/simple-xml-1.7.3.jar get-libraries: compile: [javac] Compiling 7 source files to /tmp/apache-solr-nightly/contrib/clustering/build/classes [javac] Note: /tmp/apache-solr-nightly/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/carrot2/CarrotClusteringEngine.java uses or overrides a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. build: [jar] Building jar: /tmp/apache-solr-nightly/contrib/clustering/build/apache-solr-clustering-1.4-dev.jar dist: [copy] Copying 1 file to /tmp/apache-solr-nightly/dist init: [mkdir] Created dir: /tmp/apache-solr-nightly/contrib/dataimporthandler/target/classes init-forrest-entities: compile-solrj: compile: [javac] Compiling 1 source file to /tmp/apache-solr-nightly/build/solr [javac] Note: /tmp/apache-solr-nightly/src/java/org/apache/solr/search/DocSetHitCollector.java uses or overrides a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. 
make-manifest: compile: [javac] Compiling 43 source files to /tmp/apache-solr-nightly/contrib/dataimporthandler/target/classes [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. compileExtras: [mkdir] Created dir: /tmp/apache-solr-nightly/contrib/dataimporthandler/target/extras/classes [javac] Compiling 1 source file to /tmp/apache-solr-nightly/contrib/dataimporthandler/target/extras/classes [javac] Note: /tmp/apache-solr-nightly/contrib/dataimporthandler/src/extras/main/java/org/apache/solr/handler/dataimport/MailEntityProcessor.java uses unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. build: [jar] Building jar:
Re: Dinamic field name with Data import handler
could you paste your dataconfig.xml?

On Fri, Oct 23, 2009 at 5:48 PM, Renata Mota renata.m...@accurate.com.br wrote: Ok, thanks Noble Paul. But now I am trying to use the ScriptTransformer (which requires Java 6, and I am using it), and the error below is happening. I tried it on different servers, both with Java 6, and got the same error:

SEVERE: Exception while processing: index document : SolrInputDocumnt[{}] org.apache.solr.handler.dataimport.DataImportHandlerException: script can be used only in java 6 or above Processing Document # 1
  at org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:89)
  at org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:50)
  at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:186)
  at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:80)
  at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
  at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
  at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
  at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
  at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
  at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:83)
  ... 9 more
Caused by: java.lang.NullPointerException
  at javax.script.ScriptEngineManager.getEngineByName(ScriptEngineManager.java:199)

-- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Dinamic field name with Data import handler
try putting the script tag directly under the dataConfig tag. it does not read it if it is under the document tag

On Fri, Oct 23, 2009 at 6:18 PM, Renata Mota renata.m...@accurate.com.br wrote: I did, using the example:

  <dataConfig>
    <dataSource type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@127.0.0.1:1521:XY" user="root"/>
    <document>
      <script><![CDATA[
        function f1(row) {
          row.put('id', 'Test');
          return row;
        }
      ]]></script>
      <entity name="user" pk="id" transformer="script:f1" query="SELECT * FROM user">
      </entity>
    </document>
  </dataConfig>

-- - Noble Paul | Principal Engineer| AOL | http://aol.com
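The placement fix Noble describes, applied to the config in this thread, would move the script element directly under dataConfig, roughly:

```xml
<!-- Same data source and script as in the thread; the only change is that
     <script> is now a direct child of <dataConfig>, which is where DIH
     actually reads it from. -->
<dataConfig>
  <dataSource type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@127.0.0.1:1521:XY" user="root"/>
  <script><![CDATA[
    function f1(row) {
      row.put('id', 'Test');
      return row;
    }
  ]]></script>
  <document>
    <entity name="user" pk="id" transformer="script:f1" query="SELECT * FROM user"/>
  </document>
</dataConfig>
```

With the script under document, DIH never registers f1, so the transformer lookup falls through to the "java 6 or above" error path even on Java 6.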
Re: patch for ExtractingDocumentLoader
Grant, is there any reason why the constructor can't be public? On Fri, Oct 23, 2009 at 8:07 PM, John Thorhauer jthorha...@yakabod.com wrote: Hi, I would like to extend the ExtractingRequestHandler and in doing so I will need to instantiate the ExtractingDocumentLoader in my own package. So could someone please apply this simple patch that would allow me to create an instance of the ExtractingDocumentLoader? Thanks for your help, John Thorhauer -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Dinamic field name with Data import handler
not like this. column and name cannot support templates. But you can use a Transformer to add new fields. you may write a simple javascript to do so. http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer

On Thu, Oct 22, 2009 at 7:51 PM, Renata Mota renata.m...@accurate.com.br wrote: Hi, I'm trying to give dynamic names to a field with the data import handler, but I can't get it to work. Example:

  <entity name="users" query="SELECT ID, NAME FROM USER">
    <field column="NAME" name="name_'${ users. id}'_s" />
  </entity>

Is it possible to do something like this? Thanks, Renata Gonçalves Mota mailto:renata.m...@accurate.com.br renata.m...@accurate.com.br Tel.: 55 11 3522-7723 R.3018 -- - Noble Paul | Principal Engineer| AOL | http://aol.com
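Since ${...} templates are not expanded in the name attribute, the dynamic name can instead be built inside the suggested ScriptTransformer. A sketch along those lines — the function name is mine and the dataSource element is omitted:

```xml
<!-- Sketch: compute the dynamic field name in a ScriptTransformer instead of
     templating the name attribute. addDynamicField is a hypothetical function
     name; for a row with ID 42 it adds a field such as name_42_s. -->
<dataConfig>
  <script><![CDATA[
    function addDynamicField(row) {
      row.put('name_' + row.get('ID') + '_s', row.get('NAME'));
      return row;
    }
  ]]></script>
  <document>
    <entity name="users" transformer="script:addDynamicField"
            query="SELECT ID, NAME FROM USER"/>
  </document>
</dataConfig>
```

A catch-all dynamic field such as `*_s` in schema.xml would then pick up whatever names the script produces.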
Re: [jira] Commented: (SOLR-1513) Use Google Collections in ConcurrentLRUCache
On Tue, Oct 20, 2009 at 11:57 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Tue, Oct 20, 2009 at 3:56 PM, Mark Miller markrmil...@gmail.com wrote: On Oct 20, 2009, at 12:12 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: I don't think the debate is about weak references vs. soft references. There appears to be confusion between the two here no matter what the debate: soft references are for caching, weak references are not so much. Getting it right is important. I guess the point that Lance is making is that using such a technique will make application performance less predictable. There's also a good chance that a soft-reference-based cache will cause cache thrashing and will hide OOMs caused by inadequate cache sizes. So basically we trade an OOM for more CPU usage (due to re-computation of results). That's the whole point. You're not hiding anything. I don't follow you. Using a soft-reference-based cache can hide the fact that one has inadequate memory for the cache size one has configured. Don't get me wrong. I'm not against the feature. I was merely trying to explain Lance's concerns as I understood them. Lance's concern is valid. Assuming that we are going to have this feature (non-default), we need a way to know that cache thrashing has happened. I mean the statistics should also expose the number of cache entries which got removed. This should enable the user to decide whether there should be more RAM or whether he is happy to live with the extra CPU cycles for recomputation. Personally, I think giving an option is fine. What if the user does not have enough RAM and he is willing to pay the price? Right now, there is no way he can do that at all. However, the most frequent reason behind OOMs is not having enough RAM to create the field caches, not Solr caches, so I'm not sure how important this is. How important is any feature? You don't have a use for it, so it's not important to you; someone else does, so it is important to them.
Soft value caches can be useful. Don't jump to conclusions :) The reason behind this feature request is to have Solr caches which resize themselves when enough memory is not available. I agree that soft value caches are useful for this. All I'm saying is that most OOMs that get reported on the list are due to inadequate free memory for allocating field caches. Finding a way around that will be the key to make a Lucene/Solr application practical in a limited memory environment. Just for the record, I'm +1 for adding this feature but keeping the current behavior as the default. -- Regards, Shalin Shekhar Mangar. -- - Noble Paul | Principal Engineer| AOL | http://aol.com
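For readers unfamiliar with the mechanism under discussion, here is a minimal, illustrative soft-value cache using only the JDK's java.lang.ref.SoftReference — not the Google Collections implementation the issue proposes, and not Solr's ConcurrentLRUCache. It also counts reclaimed entries, which is the kind of statistic Noble suggests exposing so users can detect thrashing.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a soft-value cache: values are held through SoftReferences,
// so under memory pressure the collector may reclaim them instead of the
// JVM throwing OutOfMemoryError. The trade-off discussed in the thread:
// reclaimed entries must be recomputed (extra CPU) rather than crashing.
public class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();
    private long reclaimed; // entries found cleared on lookup

    public void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) return null;
        V v = ref.get();
        if (v == null) {      // the collector reclaimed the value
            map.remove(key);
            reclaimed++;      // expose via stats so thrashing is visible
        }
        return v;
    }

    // A high value here relative to hits suggests the cache is thrashing
    // and the node needs more RAM, per the discussion above.
    public long reclaimedCount() { return reclaimed; }

    public static void main(String[] args) {
        SoftValueCache<String, String> cache = new SoftValueCache<>();
        cache.put("q=solr", "cached result");
        System.out.println(cache.get("q=solr")); // present while memory is plentiful
    }
}
```

Note that an LRU policy and size bounds (which Solr's real caches have) are deliberately omitted here to keep the soft-reference behavior itself in focus.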
Re: [jira] Commented: (SOLR-1513) Use Google Collections in ConcurrentLRUCache
On Wed, Oct 21, 2009 at 6:34 PM, Mark Miller markrmil...@gmail.com wrote: bq. and Mark is representing just keep working, ok?. But I'm not :) Like I said, I don't view the purpose of a soft value cache as avoiding OOMs. Size your caches correctly for that. For those that don't understand the how and why of soft value caches, they probably should not choose to use it. Users may not have a clue how much memory the caches will eventually take up. Now if the admin page can let them know cache thrashing has happened, they can think of adding more RAM. Lance Norskog wrote: On-topic: Will the Google implementations + soft references behave well with 8+ processors? Semi-on-topic: If you want to really know multiprocessor algorithms, this is the bible: The Art Of Multiprocessor Programming. Hundreds of parallel algorithms for many different jobs, all coded in Java, and cross-referenced with the java.util.concurrent package. Just amazing. http://www.elsevier.com/wps/find/bookdescription.cws_home/714091/description#description Off-topic: I was representing a system troubleshooting philosophy: Fail Early, Fail Loud. Meaning, if there is a problem like OOMs, tell me and I'll fix it permanently. But different situations call for different answers, and Mark is representing just keep working, ok?. Brittle vs. supple is one way to think of it. On Tue, Oct 20, 2009 at 11:27 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Tue, Oct 20, 2009 at 3:56 PM, Mark Miller markrmil...@gmail.com wrote: On Oct 20, 2009, at 12:12 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: I don't think the debate is about weak references vs. soft references. There appears to be confusion between the two here no matter what the debate: soft references are for caching, weak references are not so much. Getting it right is important. I guess the point that Lance is making is that using such a technique will make application performance less predictable.
There's also a good chance that a soft-reference-based cache will cause cache thrashing and will hide OOMs caused by inadequate cache sizes. So basically we trade an OOM for more CPU usage (due to re-computation of results). That's the whole point. You're not hiding anything. I don't follow you. Using a soft-reference-based cache can hide the fact that one has inadequate memory for the cache size one has configured. Don't get me wrong. I'm not against the feature. I was merely trying to explain Lance's concerns as I understood them. Personally, I think giving an option is fine. What if the user does not have enough RAM and he is willing to pay the price? Right now, there is no way he can do that at all. However, the most frequent reason behind OOMs is not having enough RAM to create the field caches, not Solr caches, so I'm not sure how important this is. How important is any feature? You don't have a use for it, so it's not important to you; someone else does, so it is important to them. Soft value caches can be useful. Don't jump to conclusions :) The reason behind this feature request is to have Solr caches which resize themselves when enough memory is not available. I agree that soft value caches are useful for this. All I'm saying is that most OOMs that get reported on the list are due to inadequate free memory for allocating field caches. Finding a way around that will be the key to making a Lucene/Solr application practical in a limited-memory environment. Just for the record, I'm +1 for adding this feature but keeping the current behavior as the default. -- Regards, Shalin Shekhar Mangar. -- Mark http://www.lucidimagination.com -- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: [jira] Commented: (SOLR-1513) Use Google Collections in ConcurrentLRUCache
On Tue, Oct 20, 2009 at 6:07 PM, Mark Miller markrmil...@gmail.com wrote: I'm +1 obviously ;) No one is talking about making it the default. And I think it's well known that soft value caches can be a valid choice; that's why Google has one in their collections here ;) It's a nice way to let your cache grow and shrink based on the available RAM. It's not always the right choice, but it sure is a nice option. And it doesn't have much to do with Lucene's FieldCaches. The main reason for a soft value cache is not to avoid OOM. Set your cache sizes correctly for that. And even if it was to avoid OOM, who cares if something else causes more of them? That's like not fixing a bug in a piece of code because another piece of code has more bugs. Anyway, their purpose is to allow the cache to size itself depending on the available free RAM, IMO. +1 Noble Paul നോബിള് नोब्ळ् wrote: So, is everyone now in favor of this feature? Who has a -1 on this? And what is the concern? On Tue, Oct 20, 2009 at 3:56 PM, Mark Miller markrmil...@gmail.com wrote: On Oct 20, 2009, at 12:12 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: I don't think the debate is about weak references vs. soft references. There appears to be confusion between the two here no matter what the debate: soft references are for caching, weak references are not so much. Getting it right is important. I guess the point that Lance is making is that using such a technique will make application performance less predictable. There's also a good chance that a soft-reference-based cache will cause cache thrashing and will hide OOMs caused by inadequate cache sizes. So basically we trade an OOM for more CPU usage (due to re-computation of results). That's the whole point. You're not hiding anything. I don't follow you. Personally, I think giving an option is fine. What if the user does not have enough RAM and he is willing to pay the price? Right now, there is no way he can do that at all.
However, the most frequent reason behind OOMs is not having enough RAM to create the field caches, not Solr caches, so I'm not sure how important this is. How important is any feature? You don't have a use for it, so it's not important to you; someone else does, so it is important to them. Soft value caches can be useful. On Tue, Oct 20, 2009 at 8:41 AM, Mark Miller markrmil...@gmail.com wrote: There is a difference: weak references are not very good for caches; soft references (soft values here) are good for caches in most JVMs. They can be very nice. Weak refs are eagerly reclaimed; it's suggested that impls should not eagerly reclaim soft refs. -- Mark http://www.lucidimagination.com (mobile) On Oct 19, 2009, at 8:22 PM, Lance Norskog goks...@gmail.com wrote: Soft references then. Weak pointers is an older term. (They're weak because some bully can steal their candy.) On Sun, Oct 18, 2009 at 8:37 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Lance, Do you mean soft references? On Sun, Oct 18, 2009 at 3:59 PM, Lance Norskog goks...@gmail.com wrote: -1 for weak references in caching. This makes memory management less deterministic (predictable) and at peak can cause cache-thrashing. In other words, the worst case gets even worse. When designing a system I want predictability and I want to control the worst case, because system meltdowns are caused by the worst case. Having thousands of small weak references does the opposite. On Sat, Oct 17, 2009 at 2:00 AM, Noble Paul (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12766864#action_12766864 ] Noble Paul commented on SOLR-1513: -- bq. Google Collections is already checked in as a dependency of Carrot clustering. In that case we need to move it to core. Jason, we do not need to remove the original option. We can probably add an extra parameter, say softRef=true or something.
That way, we are not screwing up anything and perf benefits can be studied separately. Use Google Collections in ConcurrentLRUCache Key: SOLR-1513 URL: https://issues.apache.org/jira/browse/SOLR-1513 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Jason Rutherglen Priority: Minor Fix For: 1.5 Attachments: google-collect-snapshot.jar, SOLR-1513.patch ConcurrentHashMap is used in ConcurrentLRUCache. The Google Collections concurrent map implementation allows for soft values that are great for caches that potentially exceed the allocated
Re: [jira] Commented: (SOLR-1513) Use Google Collections in ConcurrentLRUCache
On Mon, Oct 19, 2009 at 4:29 AM, Lance Norskog goks...@gmail.com wrote: -1 for weak references in caching. This makes memory management less deterministic (predictable) and at peak can cause cache-thrashing. In other words, the worst case gets even worse. When designing a system I want predictability and I want to control the worst case, because system meltdowns are caused by the worst case. Having thousands of small weak references does the opposite. Cache thrashing is really not that bad (as against the system crashing with OOM). The documentation of SoftReference specifically mentions caches as one of the applications: http://java.sun.com/j2se/1.4.2/docs/api/java/lang/ref/SoftReference.html On Sat, Oct 17, 2009 at 2:00 AM, Noble Paul (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12766864#action_12766864 ] Noble Paul commented on SOLR-1513: -- bq. Google Collections is already checked in as a dependency of Carrot clustering. In that case we need to move it to core. Jason, we do not need to remove the original option. We can probably add an extra parameter, say softRef=true or something. That way, we are not screwing up anything and perf benefits can be studied separately. Use Google Collections in ConcurrentLRUCache Key: SOLR-1513 URL: https://issues.apache.org/jira/browse/SOLR-1513 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Jason Rutherglen Priority: Minor Fix For: 1.5 Attachments: google-collect-snapshot.jar, SOLR-1513.patch ConcurrentHashMap is used in ConcurrentLRUCache. The Google Collections concurrent map implementation allows for soft values that are great for caches that potentially exceed the allocated heap. Though I suppose Solr caches usually don't use too much RAM? http://code.google.com/p/google-collections/ -- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online. -- Lance Norskog goks...@gmail.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com
DIH wiki page reverted
I have reverted the DIH wiki page to revision 212. See https://issues.apache.org/jira/browse/INFRA-2270; the wiki has not sent any mail yet. So all the changes which were made after revision 212 are lost. Please go through the page and check if your changes are lost. -- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: another one for 1.4?
let us leave it for 1.5 ... On Tue, Oct 13, 2009 at 2:44 AM, Ryan McKinley ryan...@gmail.com wrote: i say we leave it out... That is a direct mapping of the XML format to JSON. I think discussion suggested we may want to munge the format a bit before baking it into the code. On Oct 12, 2009, at 4:42 PM, Grant Ingersoll wrote: I think we leave it out. On Oct 12, 2009, at 4:27 PM, Erik Hatcher wrote: http://issues.apache.org/jira/browse/SOLR-945 - JSON update handler, comments mention it could/should make it to 1.4. Just came across it again, was curious. Erik -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: RegexTransformer's sourceColName version broken for multiValued fields?
You can open an issue describing the problem. On Thu, Oct 8, 2009 at 6:43 PM, Chantal Ackermann chantal.ackerm...@btelligent.de wrote: Hi there, (sorry for maybe posting twice.) This might be a bug; I couldn't find anything on Jira or Google. Versions in use/compared: Solr 1.3 (nightly 5th August) and the nightly from 22nd September. As RegexTransformer did not change between the two nightlies, the issue probably appeared before. ISSUE: Using RegexTransformer with the sourceColName notation will not populate multiValued (actually containing multiple values) fields with a list, but will instead add only one value per document. WORKAROUND/WORKING CONFIG: I've just rerun the index with the only difference between the reruns being the following two different usages of RegexTransformer: (Both fields are of type solr.StrField and multiValued.) This was working with 1.3, but not with the nightly from 22nd Sept: <field column="participant" sourceColName="person" regex="([^\|]+)\|.*" /> <field column="role" sourceColName="person" regex="[^\|]+\|\d+,\d+,\d+,(.*)" /> This works with the nightly from 22nd Sept: <field column="person" groupNames="participant,role" regex="([^\|]+)\|\d+,\d+,\d+,(.*)" /> Comparing the source code of RegexTransformer 1.3 vs. 22nd Sept, I found: for (Object result : results) row.put(col, result); (lines 106-107 of transformRow(), 22nd of Sept). This looks like the list items are added using the same key over and over again, which would explain why there is no list but only one item left in the end. Cheers! Chantal -- Noble Paul | Principal Engineer | AOL | http://aol.com
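The fix Chantal's analysis implies — accumulating all regex results under one key instead of overwriting it on each loop iteration — can be sketched in plain Java with no Solr dependencies (method and field names below are illustrative, not RegexTransformer's actual code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative multi-valued fix: instead of row.put(col, result) inside
// the loop (which keeps only the last match), collect all matches into a
// List and put that list once under the column key.
public class MultiValuedRegex {

    static void putMatches(Map<String, Object> row, String col,
                           List<String> sourceValues, Pattern regex) {
        List<String> results = new ArrayList<>();
        for (String value : sourceValues) {
            Matcher m = regex.matcher(value);
            if (m.find() && m.groupCount() >= 1) {
                results.add(m.group(1)); // first capture group per value
            }
        }
        row.put(col, results); // one list, not one repeatedly overwritten value
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        // Same shape as the 'person' column in the report: name|d,d,d,role
        putMatches(row, "participant",
                List.of("alice|1,2,3,lead", "bob|4,5,6,guest"),
                Pattern.compile("([^\\|]+)\\|.*"));
        System.out.println(row.get("participant")); // [alice, bob]
    }
}
```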
Re: wiki problem
I have opened an issue: https://issues.apache.org/jira/browse/INFRA-2270 2009/10/7 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: On Wed, Oct 7, 2009 at 12:03 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : our wiki is behaving very strangely after the new upgrade. Many of the : edits result in deletion of data. What is worse is that it is not : even possible to revert. : : for instance, I need to revert the page : http://wiki.apache.org/solr/DataImportHandler to revision 212 and I am : unsuccessful after 5 attempts. Can you open an INFRA issue for this in Jira and describe exactly what you tried to do with reverting (the one time I tried reverting post-upgrade it worked fine, but just didn't send email). I shall do that. It would also probably be helpful if someone who was the editor on one of these edits where huge chunks of text disappeared could comment on how they were doing the edit (text mode or GUI), and whether they verified that all the content was there when the editor loaded and when they previewed. I have first-hand experience. I was using text mode. (This may be some weird side effect of the format conversion that only pops up when using the GUI editor, or on long pages, etc...) A long page must be the reason. But the bottom line is: the only people in a position to help are the admins on INFRA, so issues need to be opened. -Hoss -- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: [Solr Wiki] Update of DataImportHandler by FergusMcMenemie
The wiki has eaten up a lot of documentation On Tue, Oct 6, 2009 at 1:54 PM, Apache Wiki wikidi...@apache.org wrote: Dear Wiki user, You have subscribed to a wiki page or wiki category on Solr Wiki for change notification. The DataImportHandler page has been changed by FergusMcMenemie: http://wiki.apache.org/solr/DataImportHandler?action=diffrev1=212rev2=213 xpath=/a/b/subje...@qualifier='fullTitle'] xpath=/a/b/subject/@qualifier xpath=/a/b/c + }}} + ! new for [[Solr1.4]] + {{{ + xpath=//a/... + xpath=/a//b... }}} @@ -768, +773 @@ document entity name=f processor=FileListEntityProcessor baseDir=/some/path/to/files fileName=.*xml newerThan='NOW-3DAYS' recursive=true rootEntity=false dataSource=null entity name=x processor=XPathEntityProcessor forEach=/the/record/xpath url=${f.fileAbsolutePath} + field column=full_name xpat0Aand can be used as a !DataSource. It must be3A//abc.com/a.txt dataSource=data-source-name + !-- copies the text to a field called 'text' in Solr-- + field column=plainText name=text/ - field column=full_name xpath=/field/xpath/ - /entity - /entity - /document - /dataConfig - }}} - Do not miss the `rootEntity` attribute. The implicit fields generated by the !FileListEntityProcessor are `fileAbsolutePath, fileSize, fileLastModified, fileName` and these are available for use within the entity X as shown above. It should be noted that !FileListEntityProcessor returns a list of pathnames and that the subsequent entity must use the !FileDataSource to fetch the files content. - - === CachedSqlEntityProcessor === - Anchor(cached) - - This is an extension of the !SqlEntityProcessor. This !EntityProcessor helps reduce the no: of DB queries executed by caching the rows. It does not help to use it in the root most entity because only one sql is run for the entity. - - Example 1. 
- {{{ - entity name=x query=select * from x - entity name=y query=select * from y where xid=${x.id} processor=CachedSqlEntityProcessor - /entity - entity + /entity }}} - The usage is exactly same as the other one. When a query is run the results are stored and if the same query is run again it is fetched from the cache and returned + Ensure that the dataSource is of type !DataSourceReader (!FileDataSource, URL!DataSource) - Example 2: - {{{ - entity name=x query=select * from x - entity name=y query=select * from y processor=CachedSqlEntityProcessor where=xid=x.id - /entity - entity - }}} - - The difference with the previous one is the 'where' attribute. In this case the query fetches all the rows from the table and stores all the rows in the cache. The magic is in the 'where' value. The cache stores the values with the 'xid' value in 'y' as the key. The value for 'x.id' is evaluated every time the entity has to be run and the value is looked up in the cache an the rows are returned. - - In the where the lhs (the part before '=') is the column in y and the rhs (the part after '=') is the value to be computed for looking up the cache. - - === PlainTextEntityProcessor === + === LineEntityProcessor === - Anchor(plaintext) + Anchor(LineEntityProcessor) ! [[Solr1.4]] - This !EntityProcessor reads all content from the data source into an single implicit field called 'plainText'. The content is not parsed in any way, however you may add transformers to manipulate the data within 'plainText' as needed or to create other additional fields. + This !EntityProcessor reads all content from the data source on a line by line basis, a field called 'rawLine' is returned for each line read. The content is not parsed in any way, however you may add transformers to manipulate the data within 'rawLine' or to create other additional fields. + The lines read can be filtered by two regular expressions '''acceptLineRegex''' and '''omitLineRegex'''. 
+ This entities additional attributes are: + * '''`url`''' : a required attribute that specifies the location of the input file in a way that is compatible with the configured datasource. If this value is relative and you are using !FileDataSource or URL!DataSource, it assumed to be relative to '''baseLoc'''. + * '''`acceptLineRegex`''' :an optional attribute that if present discards any line which does not match the regExp. + * '''`omitLineRegex`''' : an optional attribute that is applied after any acceptLineRegex and discards any line which matches this regExp. example: {{{ - entity processor=PlainTextEntityProcessor name=x url=http://abc.com/a.txt; dataSource=data-source-name + entity name=jc + processor=LineEntityProcessor + acceptLineRegex=^.*\.xml$ + omitLineRegex=/obsolete + url=file:///Volumes/ts/files.lis + rootEntity=false + dataSource=myURIreader1 +
Re: [Solr Wiki] Update of DataImportHandler by FergusMcMenemie
we will have to revert to rev=212 because the wiki has removed a lot of data On Tue, Oct 6, 2009 at 3:25 PM, Apache Wiki wikidi...@apache.org wrote: Dear Wiki user, You have subscribed to a wiki page or wiki category on Solr Wiki for change notification. The DataImportHandler page has been changed by FergusMcMenemie: http://wiki.apache.org/solr/DataImportHandler?action=diffrev1=213rev2=214 document entity name=f processor=FileListEntityProcessor baseDir=/some/path/to/files fileName=.*xml newerThan='NOW-3DAYS' recursive=true rootEntity=false dataSource=null entity name=x processor=XPathEntityProcessor forEach=/the/record/xpath url=${f.fileAbsolutePath} + field column source]] + - field column=full_name xpat0Aand can be used as a !DataSource. It must be3A//abc.com/a.txt dataSource=data-source-name + and can be used as a !DataSource. It must be3A//abc.com/a.txt dataSource=data-source-name !-- copies the text to a field called 'text' in Solr-- field column=plainText name=text/ /entity @@ -872, +874 @@ }}} === HttpDataSource === - ! Http!DataSource is being deprecated in favour of URL!DataSource in [[Solr1.4]]. There is no change in functionality between URL!DataSource and !Http!DataSource, only a name change. + ! Http!DataSource is being deprecated in favour of URLDataSource in [[Solr1.4]]. There is no change in functionality between URLDataSource and !Http!DataSource, only a name change. === FileDataSource === This can be used like an URL!DataSource but used to fetch content from files on disk. The only difference from URL!DataSource, when accessing disk files, is how a pathname is specified. The signature is as follows @@ -887, +889 @@ === FieldReaderDataSource === ! [[Solr1.4]] - This can be used like an URL!DataSource . The signature is as follows + This can be used like an URLDataSource. But instead of reading from a file:// or http:// location the entity parses the contents of a field fetched by another !EntityProcessor. 
For instance an outer !EntityProcessor could be fetching fields from a DB where one of the fields contains XML. The field containing XML could be processed by an inner XPathEntityProcessor. The signature is as follows {{{ public class FieldReaderDataSource extends DataSourceReader }}} - This can be useful for users who have a DB field containing XML and wish to use a nested X!PathEntityProcessor to process the fields contents. + This can be useful for users who have a DB field containing XML and wish to use a nested XPathEntityProcessor to process the fields contents. The datasouce may be configured as follows {{{ datasource name=f type=FieldReaderDataSource / }}} - The enity which uses this datasource must keep the url value as the variable name dataField=field-name. For instance , if the parent entity 'dbEntity' has a field called 'xmlData' . Then he child entity woould look like, + The entity which uses this datasource must specify the variable using the name dataField=field-name. For instance, if the parent entity 'dbEntity' has a field called 'xmlData'. Then he child entity would look like, {{{ entity dataSource=f processor=XPathEntityProcessor dataField=dbEntity.xmlData/ }}} -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Highlighting bean properties using DocumentObjectBinder - New feature?
go ahead, but mark it for 1.5 On Tue, Oct 6, 2009 at 4:50 PM, Avlesh Singh avl...@gmail.com wrote: Does this one deserve a JIRA issue? Cheers Avlesh On Sun, Oct 4, 2009 at 8:37 PM, Avlesh Singh avl...@gmail.com wrote: Like most others, I use SolrJ and bind my beans with @Field annotations to read responses from Solr. For highlighting these properties in my bean, I always write a separate piece: get the list of highlights from the response and then use the Map<fieldName, List<highlights>> to put them back in my original bean. This evening, I tried creating an @Highlight annotation and modified the DocumentObjectBinder to understand this attribute (with a bunch of other properties). This is how it works: you can annotate your beans with @Highlight as underneath. class MyBean { @Field @Highlight String name; @Field("solr_category_field_name") List<String> categories; @Highlight("solr_category_field_name") List<String> highlightedCategories; @Field float score; ... } and use QueryResponse#getBeans(MyBean.class) to achieve both object binding as well as highlighting. I was wondering if this can be of help to most users or not. Can this be a possible enhancement in DocumentObjectBinder? If yes, I can write a patch. Cheers Avlesh -- Noble Paul | Principal Engineer | AOL | http://aol.com
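The proposed binding can be illustrated with a self-contained sketch: a hypothetical @Highlight annotation plus a small reflection-based binder that copies entries from a field-name-to-snippets map into the annotated bean property. This is not SolrJ's actual DocumentObjectBinder API, just the idea Avlesh describes, reduced to the JDK.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.List;
import java.util.Map;

// Illustrative sketch of an @Highlight-style binder (names are hypothetical).
public class HighlightSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Highlight {
        String value() default ""; // Solr field name; empty = use bean field name
    }

    public static class MyBean {
        @Highlight("name") public List<String> highlightedName;
    }

    // Copies snippets from a per-document highlight map into annotated fields,
    // the way the real binder copies stored field values into @Field members.
    static <T> T bind(T bean, Map<String, List<String>> highlights) {
        for (Field f : bean.getClass().getFields()) {
            Highlight h = f.getAnnotation(Highlight.class);
            if (h == null) continue;
            String solrField = h.value().isEmpty() ? f.getName() : h.value();
            List<String> snippets = highlights.get(solrField);
            if (snippets != null) {
                try {
                    f.set(bean, snippets);
                } catch (IllegalAccessException e) {
                    throw new RuntimeException(e);
                }
            }
        }
        return bean;
    }

    public static void main(String[] args) {
        MyBean b = bind(new MyBean(), Map.of("name", List.of("<em>Tim</em>")));
        System.out.println(b.highlightedName); // [<em>Tim</em>]
    }
}
```

In the real feature, the highlight map would come from QueryResponse's highlighting section rather than being passed in directly.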
wiki problem
our wiki is behaving very strangely after the new upgrade. Many of the edits result in deletion of data. What is worse is that it is not even possible to revert. For instance, I need to revert the page http://wiki.apache.org/solr/DataImportHandler to revision 212 and I have been unsuccessful after 5 attempts. -- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: [jira] Commented: (SOLR-1426) Allow delta-import to run continously until aborted
the lastIndexTime is removed from DataImporter because it is redundant. On Thu, Oct 1, 2009 at 11:42 PM, Abdul Chaudhry (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12761289#action_12761289 ] Abdul Chaudhry commented on SOLR-1426: -- NOTE: the last_index_time is broken with the perpetual patch. I hacked around this by changing the data-config.xml file for the deltaQuery to do something like this: WHERE updated_at > DATE_SUB('${dataimporter.last_index_time}', INTERVAL 10 SECOND) This is because of the time discrepancy between the sleep and the writer's last_index_time. However, it looks like the delta-import is broken in the latest build of Solr trunk, revision 820731. It looks like the lastIndexTime in the DataImporter is not populated after a delta, and so if you used ${dataimporter.last_index_time} then the deltaQuery uses the wrong time. I am going to wait until delta-import is fixed before I update the patch. Allow delta-import to run continuously until aborted --- Key: SOLR-1426 URL: https://issues.apache.org/jira/browse/SOLR-1426 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 1.4 Reporter: Abdul Chaudhry Assignee: Noble Paul Fix For: 1.5 Attachments: delta-import-perpetual.patch Modify the delta-import so that it takes a perpetual flag that makes it run continuously until it is aborted. http://localhost:8985/solr/select/?command=delta-import&clean=false&qt=/dataimport&commit=true&perpetual=true perpetual means the delta import will keep running and pause for a few seconds when running queries. The only way to stop the delta import will be to explicitly issue an abort, like so: http://localhost:8985/solr/tickets/select/?command=abort -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. -- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: [jira] Commented: (SOLR-1482) Solr master and slave freeze after query
What is your index size? If you have enough RAM, try a bigger PermGen size. If you shard, you will not be able to use the same data dir. The data has to be split among shards, which means reindexing. --Noble On Fri, Oct 2, 2009 at 7:20 AM, Artem Russakovskii (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12761449#action_12761449 ] Artem Russakovskii commented on SOLR-1482: -- I'm getting an error even just trying to access a single shard's admin interface, even after adjusting -XX:MaxPermSize=512m {quote} == catalina.out == Oct 1, 2009 6:47:06 PM org.apache.coyote.http11.Http11Processor process SEVERE: Error processing request java.lang.OutOfMemoryError: PermGen space at java.lang.Throwable.getStackTraceElement(Native Method) at java.lang.Throwable.getOurStackTrace(Throwable.java:591) at java.lang.Throwable.printStackTrace(Throwable.java:510) at java.util.logging.SimpleFormatter.format(SimpleFormatter.java:72) at org.apache.juli.FileHandler.publish(FileHandler.java:129) at java.util.logging.Logger.log(Logger.java:458) at java.util.logging.Logger.doLog(Logger.java:480) at java.util.logging.Logger.logp(Logger.java:680) at org.apache.juli.logging.DirectJDKLog.log(DirectJDKLog.java:167) at org.apache.juli.logging.DirectJDKLog.error(DirectJDKLog.java:135) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:324) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454) at java.lang.Thread.run(Thread.java:619) {quote} :-/ Solr master and slave freeze after query Key: SOLR-1482 URL: https://issues.apache.org/jira/browse/SOLR-1482 Project: Solr Issue Type: Bug Affects Versions: 1.4 Environment: Nightly 9/28/09.
14 individual instances per server, using JNDI. replicateAfter commit, 5 min interval polling. All caches are currently commented out, on both slave and master. Lots of ongoing commits - large chunks of data, each accompanied by a commit. This is to guarantee that anything we think is now in Solr remains there in case the server crashes. Reporter: Artem Russakovskii Priority: Critical We're having issues with the deployment of 2 master-slave setups. One of the master-slave setups is OK (so far) but on the other both the master and the slave keep freezing, but only after I send a query to them. And by freezing I mean indefinite hanging, with almost no output to log, no errors, nothing. It's as if there's some sort of a deadlock. The hanging servers need to be killed with -9, otherwise they keep hanging. The query I send queries all instances at the same time using the ?shards= syntax. On the slave, the logs just stop - nothing shows up anymore after the query is issued. On the master, they're a bit more descriptive. 
This information seeps through very-very slowly, as you can see from the timestamps: {quote} SEVERE: java.lang.OutOfMemoryError: PermGen space Oct 1, 2009 2:16:00 PM org.apache.solr.common.SolrException log SEVERE: java.lang.OutOfMemoryError: PermGen space Oct 1, 2009 2:19:37 PM org.apache.catalina.connector.CoyoteAdapter service SEVERE: An exception or error occurred in the container during the request processing java.lang.OutOfMemoryError: PermGen space Oct 1, 2009 2:19:37 PM org.apache.coyote.http11.Http11Processor process SEVERE: Error processing request java.lang.OutOfMemoryError: PermGen space Oct 1, 2009 2:19:39 PM org.apache.catalina.connector.CoyoteAdapter service SEVERE: An exception or error occurred in the container during the request processing java.lang.OutOfMemoryError: PermGen space Exception in thread ContainerBackException in thread pool-29-threadOct 1, 2009 2:21:47 PM org.apache.catalina.connector.CoyoteAdapter service SEVERE: An exception or error occurred in the container during the request processing java.lang.OutOfMemoryError: PermGen space Oct 1, 2009 2:21:47 PM org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process SEVERE: Error reading request, ignored java.lang.OutOfMemoryError: PermGen space Oct 1, 2009 2:21:47 PM org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process SEVERE: Error reading request, ignored java.lang.OutOfMemoryError: PermGen space -22 java.lang.OutOfMemoryError: PermGen space Oct 1, 2009 2:21:47 PM org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process SEVERE: Error reading
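Since the thread above is chasing PermGen exhaustion: before raising -XX:MaxPermSize further, it can help to confirm which memory pool is actually filling. A small stdlib sketch (on the Java 6 JVMs of that era the pool appears under a name like "Perm Gen"; modern JVMs report Metaspace instead):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

/** Diagnostic sketch: print each JVM memory pool and its usage so you can see
    how close the permanent generation (or Metaspace) is to its configured limit. */
public class MemPools {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getMax() may be -1 when the pool has no configured maximum
            System.out.printf("%-30s used=%d max=%d%n",
                pool.getName(), pool.getUsage().getUsed(), pool.getUsage().getMax());
        }
    }
}
```

If the PermGen/Metaspace pool is near its max while the heap is fine, a bigger -XX:MaxPermSize (or fewer webapp instances per container) is the right lever, not -Xmx.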
Re: [jira] Commented: (SOLR-1335) load core properties from a properties file
then it is the same On Thu, Sep 24, 2009 at 1:34 PM, Artem Russakovskii (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759059#action_12759059 ] Artem Russakovskii commented on SOLR-1335: -- We're using a single core. load core properties from a properties file --- Key: SOLR-1335 URL: https://issues.apache.org/jira/browse/SOLR-1335 Project: Solr Issue Type: New Feature Reporter: Noble Paul Assignee: Noble Paul Fix For: 1.4 Attachments: SOLR-1335.patch, SOLR-1335.patch, SOLR-1335.patch, SOLR-1335.patch There are a few ways of loading properties at runtime: # using a system property on the command line # if you use multicore, dropping it in solr.xml; if not, the only way is to keep a separate solrconfig.xml for each instance. #1 is error-prone if the user fails to start with the correct system property. In our case we have four different configurations for the same deployment, and we have to disable replication of solrconfig.xml. It would be nice if I could distribute four properties files so that our ops can drop in the right one and start Solr. It is possible for operations to edit a properties file, but it is risky to edit solrconfig.xml if they do not understand Solr. I propose a properties file in the instancedir named solrcore.properties. If present, it would be loaded and its entries added as core-specific properties. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. -- - Noble Paul | Principal Engineer| AOL | http://aol.com
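The SOLR-1335 proposal reads naturally as a small sketch: drop a properties file next to solrconfig.xml and reference its entries through Solr's usual ${...} substitution (the entries below are hypothetical, not part of the patch):

```properties
# conf/solrcore.properties (hypothetical entries for one of the four deployments)
data.dir=/var/solr/data
enable.master=false
```

```xml
<!-- solrconfig.xml fragment: values resolved from core properties at load time,
     so the same solrconfig.xml can be replicated to every instance unchanged -->
<dataDir>${data.dir}</dataDir>
```

Ops then swap only the small, safe properties file per host instead of editing solrconfig.xml.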
on vacation
I am on vacation for the next 4 days. I will try to stay away from my mailbox. -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Solr nightly build failure
I guess yes. But in this case it should have retried. And probably the next attempt would have succeeded. On Wed, Sep 23, 2009 at 7:48 PM, Yonik Seeley yo...@lucidimagination.com wrote: Exception that caused the failure shown below... is this an exception that should be handled internally by LBHttpSolrServer? -Yonik http://www.lucidimagination.com testcase classname=org.apache.solr.client.solrj.TestLBHttpSolrServer name=testSimple time=22.99/testcase testcase classname=org.apache.solr.client.solrj.TestLBHttpSolrServer name=testTwoServers time=7.81 error message=java.lang.IllegalStateException: Connection is not open type=org.apache.solr.client.solrj.SolrServerExceptionorg.apache.solr.client.solrj.SolrServerException: java.lang.Ille galStateException: Connection is not open at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:217) at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89) at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118) at org.apache.solr.client.solrj.TestLBHttpSolrServer.testTwoServers(TestLBHttpSolrServer.java:130) Caused by: java.lang.IllegalStateException: Connection is not open at org.apache.commons.httpclient.HttpConnection.assertOpen(HttpConnection.java:1277) at org.apache.commons.httpclient.HttpConnection.getResponseInputStream(HttpConnection.java:858) at org.apache.commons.httpclient.HttpMethodBase.readResponseHeaders(HttpMethodBase.java:1935) at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1737) at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098) at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:415) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:242) at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:205) /error /testcase On Wed, Sep 23, 2009 at 4:41 AM, solr-dev@lucene.apache.org wrote: init-forrest-entities: [mkdir] Created dir: /tmp/apache-solr-nightly/build [mkdir] Created dir: /tmp/apache-solr-nightly/build/web compile-solrj: [mkdir] Created dir: /tmp/apache-solr-nightly/build/solrj [javac] Compiling 86 source files to /tmp/apache-solr-nightly/build/solrj [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. compile: [mkdir] Created dir: /tmp/apache-solr-nightly/build/solr [javac] Compiling 387 source files to /tmp/apache-solr-nightly/build/solr [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. compileTests: [mkdir] Created dir: /tmp/apache-solr-nightly/build/tests [javac] Compiling 176 source files to /tmp/apache-solr-nightly/build/tests [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. 
solr-cell-example: init: [mkdir] Created dir: /tmp/apache-solr-nightly/contrib/extraction/build/classes [mkdir] Created dir: /tmp/apache-solr-nightly/build/docs/api init-forrest-entities: compile-solrj: compile: [javac] Compiling 1 source file to /tmp/apache-solr-nightly/build/solr [javac] Note: /tmp/apache-solr-nightly/src/java/org/apache/solr/search/DocSetHitCollector.java uses or overrides a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. make-manifest: [mkdir] Created dir: /tmp/apache-solr-nightly/build/META-INF compile: [javac] Compiling 6 source files to /tmp/apache-solr-nightly/contrib/extraction/build/classes [javac] Note: /tmp/apache-solr-nightly/contrib/extraction/src/main/java/org/apache/solr/handler/extraction/ExtractingDocumentLoader.java uses unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. build: [jar] Building jar:
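The retry behaviour discussed at the top of this thread — skip a server whose connection turns out to be dead and fall through to the next live one — can be sketched with stdlib types only. This is an illustrative failover loop under that assumption, not the actual LBHttpSolrServer code:

```java
import java.util.List;
import java.util.function.Function;

/** Sketch of load-balanced failover: try each server in order and return the
    first successful response; only throw if every server fails. */
public class LbSketch {
    static <T> T requestWithFailover(List<String> servers, Function<String, T> send) {
        RuntimeException last = null;
        for (String server : servers) {
            try {
                return send.apply(server);   // success: stop here
            } catch (RuntimeException e) {
                last = e;                    // stale/dead connection: try the next server
            }
        }
        throw last != null ? last : new IllegalStateException("no servers configured");
    }

    public static void main(String[] args) {
        String result = requestWithFailover(
            List.of("http://s1", "http://s2"),
            url -> {
                if (url.endsWith("s1")) throw new IllegalStateException("Connection is not open");
                return "ok from " + url;
            });
        System.out.println(result); // the s1 failure is absorbed; s2 answers
    }
}
```

With this shape, the "Connection is not open" exception from the test failure would be swallowed and the next attempt would likely succeed, which is the behaviour Noble expected.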
Re: [Solr Wiki] Update of DataImportHandler by NoblePaul
the wiki screwed up the page. It has chopped off most of the content and now I am unable to roll it back to the previous version (208). On Mon, Sep 21, 2009 at 2:50 PM, Apache Wiki wikidi...@apache.org wrote: Dear Wiki user, You have subscribed to a wiki page or wiki category on Solr Wiki for change notification. The DataImportHandler page has been changed by NoblePaul: http://wiki.apache.org/solr/DataImportHandler?action=diffrev1=208rev2=209 {{{ dataSource type=JdbcDataSource driver=com.mysql.jdbc.Driver url=jdbc:mysql://localhost/dbname user=db_username password=db_password/ }}} - * The datasource configuration can also be done in solr config xml [[#solrconfigdatasource]] * The attribute 'type' specifies the implementation class. It is optional. The default value is `'JdbcDataSource'` * The attribute 'name' can be used if there are [[#multipleds|multiple datasources]] used by multiple entities * All other attributes in the dataSource tag are specific to the particular dataSource implementation being configured.
@@ -679, +678 @@ {{{ requestHandler name=/dataimport class=org.apache.solr.handler.dataimport.DataImportHandler + lst na0D - lst name=defaults - str name=configdata-config.xml/str - /lst - lst name=invariants - !-- Pass through the prefix which needs stripped from - an absolute disk path to give an absolute web path -- - str name=img_installdir/usr/local/apache2/htdocs/str - /lst - /requestHandler - }}} - - - {{{ - dataConfig - dataSource name=myfilereader type=FileDataSource/ - document - entity name=jc rootEntity=false dataSource=null - processor=FileListEntityProcessor - fileName=^.*\.xml$ recursive=true - baseDir=/usr/local/apache2/htdocs/imagery - - entity name=xrootEntity=true - dataSource=myfilereader - processor=XPathEntityProcessor - url=${jc.fileAbsolutePath} - stream=false forEach=/mediaBlock - transformer=DateFormatTransformer,TemplateTransformer,RegexTransformer,LogTransformer - logTemplate= processing ${jc.fileAbsolutePath} - logLevel=info - - - field column=fileAbsPath template=${jc.fileAbsolutePath} / - - field column=fileWebPath template=${x.fileAbsolutePath} - regex=${dataimporter.request.img_installdir}(.*) replaceWith=$1/ - - field column=fileWebDir regex=^(.*)/.* replaceWith=$1 sourceColName=fileWebPath/ - - field column=imgFilename xpath=/mediaBlock/@url / - field column=imgCaption xpath=/mediaBlock/caption / - field column=imgSrcArticle xpath=/mediaBlock/source - template=${x.fileWebDir}/${x.imgSrcArticle}// - - field column=uid regex=^(.*)$ replaceWith=$1#${x.imgFilename} sourceColName=fileWebPath/ - - !-- if imgFilename is not defined all the following will also not be defined -- - field column=imgWebPathFULL template=${x.fileWebDir}/images/${x.imgFilename}/ - field column=imgWebPathICON regex=^(.*)\.\w+$ replaceWith=${x.fileWebDir}/images/s$1.png - sourceColName=imgFilename/ - - /entity - /entity - /document - /dataConfig - }}} - - Anchor(custom-transformers) - === Writing Custom Transformers === - It is simple to add you own transformers 
and this documented on the page [[DIHCustomTransformer]] - - Anchor(entityprocessor) - == EntityProcessor == - Each entity is handled by a default Entity processor called !SqlEntityProcessor. This works well for systems which use RDBMS as a datasource. For other kind of datasources like REST or Non Sql datasources you can choose to extend this abstract class `org.apache.solr.handler.dataimport.Entityprocessor`. This is designed to Stream rows one by one from an entity. The simplest way to implement your own !EntityProcessor is to extend !EntityProcessorBase and override the `public MapString,Object nextRow()` method. - '!EntityProcessor' rely on the !DataSource for fetching data. The return type of the !DataSource is important for an !EntityProcessor. The built-in ones are, - - === SqlEntityProcessor === - This is the defaut. The !DataSource must be of type `DataSourceIteratorMapString, Object` . !JdbcDataSource can be used with this. - - === XPathEntityProcessor === - Used when indexing XML type data. The !DataSource must be of type `DataSourceReader` . URL!DataSource ! [[Solr1.4]] or !FileDataSource is commonly used with X!PathEntityProcessor. - - === FileListEntityProcessor === - A simple entity processor which can be used to enumerate the list of files from a
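The nextRow() contract the (truncated) wiki text describes — stream rows one at a time and return null when the entity is exhausted — can be sketched with stdlib types only. A real processor would extend Solr's EntityProcessorBase; the names here are hypothetical:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of the EntityProcessor row-streaming contract: each call to nextRow()
    yields one field map; null signals that the entity has no more rows. */
public class RowStreamSketch {
    private final Iterator<String> source;

    RowStreamSketch(Iterable<String> lines) {
        this.source = lines.iterator();   // stands in for a DataSource (JDBC, file, ...)
    }

    public Map<String, Object> nextRow() {
        if (!source.hasNext()) return null;          // end of data for this entity
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("line", source.next());              // one illustrative field per row
        return row;
    }
}
```

Streaming rows this way is what lets DIH index sources much larger than memory: only one row is materialised at a time.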
Re: Solr API
SolrQuery q = new SolrQuery().setParam(qt,/dataimport).setParam(command, full-import); solrServer.query(q); On Tue, Sep 15, 2009 at 11:32 AM, Asish Kumar Mohanty amoha...@del.aithent.com wrote: Hi Sir, still facing problem.. i cannot understand how to provide the command http://localhost:8983/solr/db/dataimport?command=full-import.. can anybody plz help me out??? - Original Message - From: Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com To: solr-dev@lucene.apache.org Sent: Monday, September 14, 2009 5:26 PM Subject: Re: Solr API SolrJ can be used to make any name value request to Solr. use the SolrQuery#set(name,val) On Mon, Sep 14, 2009 at 4:47 PM, Asish Kumar Mohanty amoha...@del.aithent.com wrote: Yes Sir.. SolrJ API... Regards Asish - Original Message - From: Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com To: solr-dev@lucene.apache.org Sent: Monday, September 14, 2009 4:40 PM Subject: Re: Solr API did you mean SolrJ API? On Mon, Sep 14, 2009 at 4:15 PM, Asish Kumar Mohanty amoha...@del.aithent.com wrote: Hi, I just want to write a Solr API for full-import. Can anybody please help me out??? It's very urgent. Regards Asish -- - Noble Paul | Principal Engineer| AOL | http://aol.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com
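For reference, the SolrQuery one-liner above ultimately just builds a name/value request string. A stdlib sketch of the equivalent URL construction (endpoint and core name taken from the example in the thread):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Sketch: the name/value pairs set on a SolrQuery become plain query-string
    parameters on the core's URL. */
public class DataImportUrl {
    static String buildUrl(String base, Map<String, String> params) {
        return base + "?" + params.entrySet().stream()
            .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("qt", "/dataimport");       // route to the DataImportHandler
        p.put("command", "full-import");  // DIH command, as in the thread
        System.out.println(buildUrl("http://localhost:8983/solr/db", p));
        // prints http://localhost:8983/solr/db?qt=%2Fdataimport&command=full-import
    }
}
```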
Re: Problem with DataImportHandler and JDBC
the parameter name is batchSize by default it is set as 500 On Tue, Sep 15, 2009 at 7:27 PM, Luc Caprini luc.capr...@sun.com wrote: Hi, I'm trying using SOLR 1.4. and in my first test, I've got an issue with the DataImportHandler ... Config : SOLR 1.4 on tomcat 5.5.27 with mysql driver 5.1.7 on mysql 5.1 My dataconfig is dataConfig dataSource driver=com.mysql.jdbc.Driver url=jdbc:mysql://localhost/testa user=root password=password/ document entity name=a query=select * from a field column=b name=named/ /entity /document /dataConfig And it generates this error Unable to execute query: select * from a Processing Document # 1 at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:251) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:208) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39) at org.apache.solr.handler.dataimport.DebugLogger$2.getData(DebugLogger.java:184) at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58) at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:71) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237) at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339) at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:225) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167) at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:393) at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:203) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:1301) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875) at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665) at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528) at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81) at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689) at java.lang.Thread.run(Unknown Source) Caused by: java.sql.SQLException: * Illegal value for setFetchSize()*. at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:926) at com.mysql.jdbc.StatementImpl.setFetchSize(StatementImpl.java:2404) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:240) ... 
30 more/str I tried to add fetchsize as a parameter : dataSource driver=com.mysql.jdbc.Driver url=jdbc:mysql://localhost/testa user=root password=password fetchSize=1/ but result is the same. Thanks in advance for your assistance Kind regards Luc -- http://www.sun.com * Luc Caprini * Client Solutions Architect *Sun Microsystems, Inc.* 13 Avenue Morane Saulnier Velizy 78140 France Phone +33 (0) 1 34 03 00 20 Mobile +33 (0) 6 12 30 16 22 Fax +33 (0) 1 34 03 10 11 Email luc.capr...@sun.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com
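Putting Noble's answer into the config from the thread: batchSize is an attribute on the dataSource element (credentials as in Luc's example). The DIH FAQ's usual resolution for MySQL fetch-size errors is batchSize="-1", which makes DIH pass Integer.MIN_VALUE to setFetchSize(), the value the MySQL driver accepts for row-by-row streaming:

```xml
<dataSource driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost/testa"
            user="root" password="password"
            batchSize="-1"/>
<!-- batchSize="-1" => setFetchSize(Integer.MIN_VALUE): MySQL streaming mode -->
```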
Re: Solr 1.4 Open Issues Status
On Tue, Sep 15, 2009 at 7:44 PM, Grant Ingersoll gsing...@apache.org wrote: Here's where we are at for 1.4. My comments are marked by . I think we are in pretty good shape, people just need to make some final commits. If things are still unassigned tomorrow morning, I'm going to push them to 1.5. Key Summary Assignee SOLR-1427 SearchComponents aren't listed on registry.jsp Grant Ingersoll I just put up a patch that I believe is ready to commit. SOLR-1423 Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream others Koji Sekiguchi Koji? SOLR-1407 SpellingQueryConverter now disallows underscores and digits in field names (but allows all UTF-8 letters) Shalin Shekhar Mangar Needs a patch and a unit test. Push to 1.5? SOLR-1396 standardize the updateprocessorchain syntax Unassigned No patch exists and no work has been done on it. Seems like we should get this right. Volunteers? Let us push it to 1.5 . Anyway there are more things to be clened up as a part of SOLR-1198 SOLR-1366 UnsupportedOperationException may be thrown when using custom IndexReader Mark Miller Patch exists. Mark? SOLR-1319 Upgrade custom Solr Highlighter classes to new Lucene Highlighter API Mark Miller No patch. Mark? SOLR-1314 Upgrade Carrot2 to version 3.1.0 Grant Ingersoll Waiting on Lucene 2.9 and a minor licensing issue. SOLR-1300 need to exlcude downloaded clustering libs from release packages Grant Ingersoll This will be handled during release packaging. SOLR-1294 SolrJS/Javascript client fails in IE8! Unassigned I have concerns about this client library being included at all in Solr, as I don't see anyone taking it up for maintenance. I raised concerns on the main issue with no response and likewise with this one. Patch exists. Who handled the original contribution? SOLR-1292 show lucene fieldcache entries and sizes Mark Miller Hoss' patch is a reasonable start. I think this can be committed. We can iterate in 1.5. Mark or Hoss? 
SOLR-1221 Change Solr Highlighting to use the SpanScorer with MultiTerm expansion by default Mark Miller Mark? Seems ready to go. SOLR-1170 Java replication replicates lucene lock file Noble Paul Noble? Has a patch that looks reasonable for now. Might be problematic if Lucene ever changes the extension of the lock files. I am not sure if there is a problem. Because lucene should not be returning the .lock file name in the list of files. Anyway the chances of Lucene changing the file extension is slim. -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Problem with DataImportHandler and JDBC
refer this http://wiki.apache.org/solr/DataImportHandlerFaq#head-149779b72761ab071c841879545256bdbbdc15d2 On Tue, Sep 15, 2009 at 11:40 PM, Luc Caprini luc.capr...@sun.com wrote: Hi Thanks for your quick answer. it is ok now... In fact, I try to read BLOB from my SQL Database. I'm trying the simplest test i can by only trying to read a blob column in a table. So my dataconfig is : dataConfig dataSource driver=com.mysql.jdbc.Driver url=jdbc:mysql://localhost/testa user=root password=password batchSize=1/ document entity name=a transformer=ClobTransformer query=select * from a field column=a name=mycol clob=true / /entity /document /dataConfig Result is : ?xml version=1.0 encoding=UTF-8 ? - # response - # lst name=*responseHeader* int name=*status*0/int int name=*QTime*70/int /lst - # lst name=*initArgs* - # lst name=*defaults* str name=*config*data-conf.xml/str /lst /lst str name=*command*full-import/str str name=*mode*debug/str null name=*documents* / - # lst name=*verbose-output* - # lst name=*entity:a* - # lst name=*document#1* str name=*query*select * from a/str str name=*time-taken*0:0:0.40/str str--- row #1-/str str name=*a*[B:[...@3f4ebd/str str-/str - # lst name=*transformer:ClobTransformer* str-/str str name=*a*[B:[...@3f4ebd/str str-/str /lst /lst lst name=*document#1* / /lst /lst str name=*status*idle/str str name=*importResponse*Configuration Re-loaded sucessfully/str - # lst name=*statusMessages* str name=*Total Requests made to DataSource*1/str str name=*Total Rows Fetched*1/str str name=*Total Documents Skipped*0/str str name=*Full Dump Started*2009-09-15 20:06:12/str str name=*Time taken*0:0:0.60/str /lst str name=*WARNING*This response format is experimental. It is likely to change in the future./str /response As you can see, my column is not transformed. I read the code for ClobTransformer, and it seems that this class is only able to read Clob and not Blob. Am I wrong ? How i can achieve my goal ? 
Thanks in advance Regards Luc Le 15/09/2009 16:42, Noble Paul ??? ?? a écrit : the parameter name is batchSize by default it is set as 500 On Tue, Sep 15, 2009 at 7:27 PM, Luc Capriniluc.capr...@sun.com wrote: Hi, I'm trying using SOLR 1.4. and in my first test, I've got an issue with the DataImportHandler ... Config : SOLR 1.4 on tomcat 5.5.27 with mysql driver 5.1.7 on mysql 5.1 My dataconfig is dataConfig dataSource driver=com.mysql.jdbc.Driver url=jdbc:mysql://localhost/testa user=root password=password/ document entity name=a query=select * from a field column=b name=named/ /entity /document /dataConfig And it generates this error Unable to execute query: select * from a Processing Document # 1 at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:251) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:208) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39) at org.apache.solr.handler.dataimport.DebugLogger$2.getData(DebugLogger.java:184) at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58) at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:71) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237) at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339) at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:225) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167) at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:393) at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:203) at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1301) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172) at
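On the BLOB question itself: ClobTransformer only handles java.sql.Clob, so a BLOB column needs a custom transformer, as the DIHCustomTransformer page describes. The conversion such a transformer would perform on each row value can be sketched with stdlib types (the UTF-8 payload is an assumption, and "BlobToString" is a hypothetical name; a real DIH transformer would wrap this logic in a transformRow() override):

```java
import java.nio.charset.StandardCharsets;

/** Sketch of the per-column conversion a custom BLOB transformer would do:
    JDBC hands a BLOB column to DIH as a byte[], which must be decoded to text
    before it can be indexed as a string field. */
public class BlobToString {
    static Object toText(Object columnValue) {
        if (columnValue instanceof byte[]) {
            // ASSUMPTION: the blob holds UTF-8 text; binary payloads need different handling
            return new String((byte[]) columnValue, StandardCharsets.UTF_8);
        }
        return columnValue; // already a String (or Clob-derived value): leave untouched
    }

    public static void main(String[] args) {
        System.out.println(toText("hello".getBytes(StandardCharsets.UTF_8))); // prints hello
    }
}
```

This also explains the [B:[...@3f4ebd output Luc saw: that is Java's default toString() for a byte[], i.e. the value was never decoded.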
Re: Solr API
did you mean SolrJ API? On Mon, Sep 14, 2009 at 4:15 PM, Asish Kumar Mohanty amoha...@del.aithent.com wrote: Hi, I just want to write a Solr API for full-import. Can anybody please help me out??? It's very urgent. Regards Asish -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Solr API
SolrJ can be used to make any name value request to Solr. use the SolrQuery#set(name,val) On Mon, Sep 14, 2009 at 4:47 PM, Asish Kumar Mohanty amoha...@del.aithent.com wrote: Yes Sir.. SolrJ API... Regards Asish - Original Message - From: Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com To: solr-dev@lucene.apache.org Sent: Monday, September 14, 2009 4:40 PM Subject: Re: Solr API did you mean SolrJ API? On Mon, Sep 14, 2009 at 4:15 PM, Asish Kumar Mohanty amoha...@del.aithent.com wrote: Hi, I just want to write a Solr API for full-import. Can anybody please help me out??? It's very urgent. Regards Asish -- - Noble Paul | Principal Engineer| AOL | http://aol.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: [jira] Created: (SOLR-1392) NPE on replication page on slave
By any chance can you share that file _index__jsp.java ? On Fri, Aug 28, 2009 at 7:32 PM, Reuben Firmin (JIRA)j...@apache.org wrote: NPE on replication page on slave Key: SOLR-1392 URL: https://issues.apache.org/jira/browse/SOLR-1392 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 1.4 Reporter: Reuben Firmin On our slave's replication page, I periodically see this exception. java.lang.NullPointerException at _jsp._admin._replication._index__jsp._jspService(_index__jsp.java:265) at com.caucho.jsp.JavaPage.service(JavaPage.java:61) at com.caucho.jsp.Page.pageservice(Page.java:578) at com.caucho.server.dispatch.PageFilterChain.doFilter(PageFilterChain.java:192) at com.caucho.server.webapp.DispatchFilterChain.doFilter(DispatchFilterChain.java:97) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:280) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:108) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:264) at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76) at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158) at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:435) at com.caucho.server.port.TcpConnection.run(TcpConnection.java:586) at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:690) at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:612) at java.lang.Thread.run(Thread.java:619) Date: Fri, 28 Aug 2009 13:53:59 GMT Server: Apache/2.2.3 (Red Hat) Content-Type: text/html; charset=utf-8 Vary: Accept-Encoding,User-Agent Content-Encoding: gzip Content-Length: 524 Connection: close [binary gzip-encoded response body omitted] -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: [jira] Commented: (SOLR-1392) NPE on replication page on slave
does it work if you hit the url http://master/replication directly? On Sat, Aug 29, 2009 at 3:56 AM, Reuben Firmin (JIRA)j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12749008#action_12749008 ] Reuben Firmin commented on SOLR-1392: - There's some issue on the master. What does host mean in this context? http://master/replication?command=detailswt=xml java.lang.IllegalArgumentException: host parameter is null at org.apache.commons.httpclient.HttpConnection.init(HttpConnection.java:206) at org.apache.commons.httpclient.HttpConnection.init(HttpConnection.java:155) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionWithReference.init(MultiThreadedHttpConnectionManager.java:1145) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool.createConnection(MultiThreadedHttpConnectionManager.java:762) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:476) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at org.apache.solr.handler.SnapPuller.getNamedListResponse(SnapPuller.java:192) at org.apache.solr.handler.SnapPuller.getCommandResponse(SnapPuller.java:187) at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:589) at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:180) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299) at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76) at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158) at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:435) at com.caucho.server.port.TcpConnection.run(TcpConnection.java:586) at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:690) at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:612) at java.lang.Thread.run(Thread.java:619) Date: Fri, 28 Aug 2009 22:22:53 GMT Server: Apache/2.2.3 (Red Hat) Cache-Control: no-cache, no-store Pragma: no-cache Expires: Sat, 01 Jan 2000 01:00:00 GMT Content-Type: text/html; charset=UTF-8 Vary: Accept-Encoding,User-Agent Content-Encoding: gzip Content-Length: 713 Connection: close NPE on replication page on slave Key: SOLR-1392 URL: https://issues.apache.org/jira/browse/SOLR-1392 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 1.4 Reporter: Reuben Firmin Assignee: Noble Paul Fix For: 1.4 On our slave's replication page, I periodically see this exception. 
java.lang.NullPointerException at _jsp._admin._replication._index__jsp._jspService(_index__jsp.java:265) at com.caucho.jsp.JavaPage.service(JavaPage.java:61) at com.caucho.jsp.Page.pageservice(Page.java:578) at com.caucho.server.dispatch.PageFilterChain.doFilter(PageFilterChain.java:192) at com.caucho.server.webapp.DispatchFilterChain.doFilter(DispatchFilterChain.java:97) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:280) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:108) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:264) at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76) at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158) at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178) at