Re: Solr Php Client

2011-04-08 Thread Haspadar
I'm entering only a query parameter. I posted a bug description there - http://pecl.php.net/bugs/bug.php?id=22634 2011/4/8 Israel Ekpo israele...@gmail.com Hi, Could you send the entire list of parameters you are sending to solr via the SolrClient and SolrQuery object? Please open a bug

Re: how to set cookie for url requesting in stream_url

2011-04-08 Thread satya swaroop
Hi All, I was able to set the cookie value on the stream_url connection. I was able to pass the cookie value up to the ContentStreamBase.URLStream class, and I added conn.setRequestProperty("Cookie", cookie[0].name + "=" + cookie[0].value) in the connection setup, and it is working fine now. Regards,
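
As a rough illustration of the approach described above, the sketch below attaches a Cookie header to a URL connection before it is used. The class and method names (CookieRequestSketch, cookieHeader, openWithCookie) are made up for this example and are not part of Solr; only URL.openConnection and URLConnection.setRequestProperty are standard JDK calls.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper, not part of Solr: shows how a cookie can be attached
// to a URLConnection before the connection is used.
class CookieRequestSketch {

    // Formats one name/value pair as the value of a Cookie request header.
    static String cookieHeader(String name, String value) {
        return name + "=" + value;
    }

    // Creates the connection object and sets the Cookie header on it.
    // openConnection() does not contact the server; that happens on connect()
    // or when the response is first read.
    static URLConnection openWithCookie(URL url, String name, String value) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setRequestProperty("Cookie", cookieHeader(name, value));
        return conn;
    }
}
```

Several cookies would normally be sent in one header, separated by "; ", rather than by calling setRequestProperty once per cookie, since each call replaces the previous value.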

Re: Very very large scale Solr Deployment = how to do (Expert Question)?

2011-04-08 Thread Albert Vila
Ephraim, I still can't view the document. Don't know if I'm doing something wrong, but I downloaded it and It appears to be empty. Albert On 7 April 2011 09:32, Ephraim Ofir ephra...@icq.com wrote: You can't view it online, but you should be able to download it from:

Re: Lucid Works

2011-04-08 Thread Andrzej Bialecki
On 4/7/11 10:16 PM, Mark wrote: Andrezej, Thanks for the info. I have a question regarding stability though. How are you able to guarantee the stability of this release when 4.0 is still a work in progress? I believe the last version Lucid released was 1.4 so why did you choose to release a 4.x

StreamingUpdateSolrServer and PHP

2011-04-08 Thread stockii
is it possible to use StreamingUpdateSolrServer with a php application ? - --- System One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 1 Core with 31 Million Documents other Cores 100.000 - Solr1 for Search-Requests -

Re: Sourcesense packager

2011-04-08 Thread Simone Tripodi
Hi Mark, thanks for your interest!!! That's a feature we haven't had the time to work on (yet ;)) As a workaround, what I can suggest is modifying the tomcat-users.xml file inside the produced tomcat, once unzipped. HTH, please let me know!!! Have a nice day, Simo

Re: Using MLT feature

2011-04-08 Thread lboutros
It seems that tokens are sorted by frequencies : ... Collections.sort(profile, new TokenComparator()); ... and private static class TokenComparator implements Comparator<Token> { public int compare(Token t1, Token t2) { return t2.cnt - t1.cnt; } } and cnt is the token count.

RE: Using MLT feature

2011-04-08 Thread Frederico Azeiteiro
Hi. Yes, I managed to create a stable comparator in C# for profile. The problem comes before that, at: ... tokens.put(s, tok); ... Imagine you have 2 tokens with the same frequency: the stable sort comparator for profile will maintain the original order. The problem is that the original

Re: Using MLT feature

2011-04-08 Thread lboutros
Couldn't you extend the TextProfileSignature and modify the TokenComparator class to use lexical order when tokens have the same frequency? Ludovic. 2011/4/8 Frederico Azeiteiro [via Lucene] ml-node+2794604-1683988626-383...@n3.nabble.com Hi. Yes, I managed to create a stable comparator in
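
A minimal sketch of the tie-breaking idea suggested above: sort by descending count, then fall back to lexical order of the token text when counts are equal, so the result no longer depends on the order in which tokens were inserted. The Token class here is a simplified stand-in written for this example (the field name val is an assumption), not the actual class inside TextProfileSignature.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class TokenOrderSketch {

    // Simplified stand-in for the token type; "val" is a hypothetical field name.
    static class Token {
        final String val;
        final int cnt;

        Token(String val, int cnt) {
            this.val = val;
            this.cnt = cnt;
        }
    }

    // Higher counts first; equal counts are ordered by token text.
    static class TokenComparator implements Comparator<Token> {
        @Override
        public int compare(Token t1, Token t2) {
            // Integer.compare avoids the overflow risk of "t2.cnt - t1.cnt".
            int byCount = Integer.compare(t2.cnt, t1.cnt);
            if (byCount != 0) {
                return byCount;
            }
            return t1.val.compareTo(t2.val);
        }
    }

    // Returns the token texts in profile order without modifying the input list.
    static List<String> sortedValues(List<Token> profile) {
        List<Token> copy = new ArrayList<>(profile);
        Collections.sort(copy, new TokenComparator());
        List<String> values = new ArrayList<>();
        for (Token t : copy) {
            values.add(t.val);
        }
        return values;
    }
}
```

With this comparator two implementations (for example one in Java and one in C#) should produce the same ordering as long as they compare strings the same way; String.compareTo compares UTF-16 code units, so the C# side would need an ordinal comparison to match.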

RE: Using MLT feature

2011-04-08 Thread Frederico Azeiteiro
Yes, i guess that could be an option, but I'm not very experienced with Java development and SOLR modifications. As my main goal was to create a similar sig in C#, I just use the c# method to create the sig myself before indexing instead of SOLR Deduplicate function. That way, when searching I

Re: Tips for getting unique results?

2011-04-08 Thread Shaun Campbell
Pete Surely the default sort order for facets is by descending count order. See http://wiki.apache.org/solr/SimpleFacetParameters. If your results are really sorted in ascending order can't you sort them externally eg Java? Hope that helps. Shaun
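
For the "sort them externally" suggestion above, a small sketch of re-sorting facet counts on the client side. It assumes the counts have already been read out of the response into a Map; the class and method names are invented for this example and nothing here calls Solr or SolrJ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class FacetSortSketch {

    // Returns the facet values ordered by count, highest first.
    // Values with equal counts are ordered alphabetically for a stable result.
    static List<String> valuesByCountDescending(Map<String, Long> counts) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, Long> e : entries) {
            values.add(e.getKey());
        }
        return values;
    }
}
```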

Re: UIMA example setup w/o OpenCalais

2011-04-08 Thread Tommaso Teofili
Hi Jay, you should be able to do so by simply removing the OpenCalaisAnnotator from the execution pipeline commenting the line 124 of the file: solr/contrib/uima/src/main/resources/org/apache/uima/desc/OverridingParamsExtServicesAE.xml Hope this helps, Tommaso 2011/4/7 Jay Luker

Re: Indexing pdf files - question.

2011-04-08 Thread Mike
Hi Erick, Thank you for the reply. Now I am able to index the PDF files and search. I am left with a couple of questions: 1. Can I add a custom field to the Search Response XML (Ex: I need to add a description which gives a brief description of the PDF file). 2. Currently Solr runs as a separate

Re: Very very large scale Solr Deployment = how to do (Expert Question)?

2011-04-08 Thread François Schiettecatte
You might also want to look at the heritrix crawler too: http://crawler.archive.org/ I have written three crawlers in the past, all for RSS feeds, it is not easy. Happy to provide tips and help if you want to go down that route. François On Apr 8, 2011, at 1:53 AM, Andrea Campi wrote:

solrj dependency wstx-asl

2011-04-08 Thread Tim Terlegård
solrj has a dependency on wstx-asl. I've successfully used Solr 1.4 maven artifacts for a while and the wstx-asl dependency had the wrong groupId so it's always been missing in my application, but it has still worked fine. Is wstx-asl really needed? Is it only needed in certain circumstances? Is

Tutorial StreamingUpdateSolrServer

2011-04-08 Thread stockii
Hello. i want to change my full-imports from DIH to use of Java and StreamingUpdateSolrServer ... is in the wiki a little how to or something similar ? - --- System One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 1

Re: Very very large scale Solr Deployment = how to do (Expert Question)?

2011-04-08 Thread Andy
I can't view the document either -- it showed up empty. Has anyone succeeded in viewing it? Andy --- On Fri, 4/8/11, Albert Vila a...@imente.com wrote: From: Albert Vila a...@imente.com Subject: Re: Very very large scale Solr Deployment = how to do (Expert Question)? To:

Re: Very very large scale Solr Deployment = how to do (Expert Question)?

2011-04-08 Thread Albert Vila
Yes, It won't work if you are using OpenOffice. However it works fine with Microsoft Word. Hope it helps. Albert On 8 April 2011 14:55, Andy angelf...@yahoo.com wrote: I can't view the document either -- it showed up empty. Has anyone succeeded in viewing it? Andy --- On Fri, 4/8/11,

Re: MoreLikeThis match

2011-04-08 Thread Brian Lamb
I've looked at both wiki pages and none really clarify the difference between these two. If I copy and paste an existing index value for field and do an mlt search, it shows up under match but not results. What is the difference between these two? On Thu, Apr 7, 2011 at 2:24 PM, Brian Lamb

Re: Very very large scale Solr Deployment = how to do (Expert Question)?

2011-04-08 Thread Andy
Could anyone please post a version of the document in pdf or openoffice format? I'm on Linux so there's no way for me to use MS Word. Thanks. --- On Fri, 4/8/11, Albert Vila a...@imente.com wrote: From: Albert Vila a...@imente.com Subject: Re: Very very large scale Solr Deployment = how to

Re: Very very large scale Solr Deployment = how to do (Expert Question)?

2011-04-08 Thread Pascal Coupet
I did put a pdf version here: https://docs.google.com/viewer?a=v&pid=explorer&chrome=true&srcid=0B02DHBZQYYT_MmRkZTY0YjQtODJmZS00Mzg0LWJiNTEtOWJjNzViNmNjZjdh&hl=en&authkey=CL2Fq_QG Zoom it to get a better view. Pascal 2011/4/8 Andy angelf...@yahoo.com Could anyone please post a version of the

Strip spaces and new line characters from data

2011-04-08 Thread alexei
Hello Everyone, I am getting my integer field data from xml. Some docs fail because of a newline character at the end of the string. I am attempting to strip spaces and new line characters as follows: The above still results in a NumberFormatException. Is this the right
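
One client-side way to avoid the exception described above is to trim the raw value before it is parsed or sent to Solr. The sketch below is only an illustration of that idea, with an invented class name; it does not show the schema or import configuration from the original message, which did not survive in the archive.

```java
class IntFieldCleanupSketch {

    // String.trim() removes leading and trailing characters up to U+0020,
    // which covers spaces, tabs, carriage returns and newlines.
    // Integer.parseInt still throws NumberFormatException for anything
    // that is not a plain integer after trimming.
    static int parseIntField(String raw) {
        return Integer.parseInt(raw.trim());
    }
}
```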

RE: Problems indexing very large set of documents

2011-04-08 Thread Brandon Waterloo
I had some time to do some research into the problems. From what I can tell, it appears Solr is tripping up over the filename. These are strictly examples, but, Solr handles this filename fine: 32-130-A0-84-african_activist_archive-a0a6s3-b_12419.pdf However, it fails with either a parsing

Re: Very very large scale Solr Deployment = how to do (Expert Question)?

2011-04-08 Thread Andy
Perfect. Thank you very much. Andy --- On Fri, 4/8/11, Pascal Coupet pcou...@gmail.com wrote: From: Pascal Coupet pcou...@gmail.com Subject: Re: Very very large scale Solr Deployment = how to do (Expert Question)? To: solr-user@lucene.apache.org Date: Friday, April 8, 2011, 10:20 AM I

Surge 2011 CFP Deadline Extended

2011-04-08 Thread Katherine Jeschke
OmniTI is pleased to announce that the CFP deadline for Surge 2011, the Scalability and Performance Conference, (Baltimore: Sept 28-30, 2011) has been extended to 23:59:59 EDT, April 17, 2011. The event focuses upon case studies that demonstrate successes (and failures) in Web applications and

Re: Lucid Works

2011-04-08 Thread Mark
Doesn't look like you allow new members to post questions in that forum. I have just one last question ;) We are deciding whether to upgrade our 1.4 production environment to 4.x or 3.1. What were your decisions when deciding to release 4.x over 3.1? Thanks again On 4/8/11 1:13 AM, Andrzej

Re: Trade Mark symbol(TM) in Index

2011-04-08 Thread Em
Hi, I have to jump into this topic. I can not find the mentioned replies, Markus but I still noticed that problem, too. What could be the cause? Regards, Em Markus Jelsma-2 wrote: You opened the same thread this monday and got two replies. Hi, Has anyone indexed the data with Trade

Re: Problems indexing very large set of documents

2011-04-08 Thread Ezequiel Calderara
Maybe those files are created with a different Adobe Format version... See this: http://lucene.472066.n3.nabble.com/PDF-parser-exception-td644885.html On Fri, Apr 8, 2011 at 12:14 PM, Brandon Waterloo brandon.water...@matrix.msu.edu wrote: A second test has revealed that it is something to do

Re: Problems indexing very large set of documents

2011-04-08 Thread Ezequiel Calderara
Ohh sorry... didn't realize that they already sent you that link :P On Fri, Apr 8, 2011 at 12:35 PM, Ezequiel Calderara ezech...@gmail.com wrote: Maybe those files are created with a different Adobe Format version... See this:

Re: Trade Mark symbol(TM) in Index

2011-04-08 Thread Markus Jelsma
http://lucene.472066.n3.nabble.com/Indexing-data-with-Trade-Mark-Symbol-td2774421.html Hi, I have to jump into this topic. I can not find the mentioned replies, Markus but I still noticed that problem, too. What could be the cause? Regards, Em Markus Jelsma-2 wrote: You opened

Special characters during indexing and searching

2011-04-08 Thread alexw
Hi, I have a field named productName in my schema which uses the standard text field type. One of my product names is star/bit. When I search for star/bit (without quotes) using the dismax request handler, no results were found. After some research, it looks like during indexing, star/bit was

RE: Problems indexing very large set of documents

2011-04-08 Thread Brandon Waterloo
I think I've finally found the problem. The files that work are PDF version 1.6. The files that do NOT work are PDF version 1.4. I'll look into updating all the old documents to PDF 1.6. Thanks everyone! ~Brandon Waterloo From: Ezequiel Calderara
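
To check which files fall into which group, the version can usually be read from the first bytes of a PDF, which normally start with a header of the form %PDF-x.y. The sketch below reads that header; the class and method names are made up for this example. It is an approximation: a PDF may also override its version in the document catalog, which this does not inspect.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PdfVersionSketch {

    // Returns the "x.y" part of a "%PDF-x.y" header, or null if the
    // bytes are too short or do not start with "%PDF-".
    static String versionFromHeader(byte[] head) {
        if (head == null || head.length < 8) {
            return null;
        }
        String s = new String(head, 0, 8, StandardCharsets.US_ASCII);
        if (!s.startsWith("%PDF-")) {
            return null;
        }
        return s.substring(5);
    }

    // Reads the first 8 bytes of a file and extracts the header version.
    static String versionOfFile(Path pdf) throws IOException {
        try (InputStream in = Files.newInputStream(pdf)) {
            byte[] head = new byte[8];
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break; // file shorter than a PDF header
                }
                read += n;
            }
            return read == head.length ? versionFromHeader(head) : null;
        }
    }
}
```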

Re: How to index PDF file stored in SQL Server 2008

2011-04-08 Thread Darx Oman
Hi there TikaEntityProcessor is available as part of DIH-extras*.jar in 3.x and 4.0

Re: How to index PDF file stored in SQL Server 2008

2011-04-08 Thread Darx Oman
Hi again, what you are missing is the field mapping <field column="id" name="id" />. There is no need for TikaEntityProcessor since you are not accessing pdf files

Re: Strip spaces and new line characters from data

2011-04-08 Thread Erick Erickson
Your schema stuff didn't come through, possibly your mail server is removing it. But two things come to mind. First, Solr has a TrimFilterFactory, see: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.TrimFilterFactory

Re: Lucid Works

2011-04-08 Thread Erick Erickson
Unless you need the goodies in 4.x, I'd go with 3.1, just on the principle that 4.x is more fluid than 3.1, and I'd go with more static code. 4.x gets whatever patches the committers decide are good whereas 3.1 (or 3.2 if that comes out) will have a smaller set of changes. Both are well tested,

Re: Special characters during indexing and searching

2011-04-08 Thread Erick Erickson
This works fine for me. Tack on debugQuery=on to your URL and post that please unless the stuff below helps But note a couple of things 1 productName isn't part of the default dismax configuration in your solrconfig.xml file, so unless you put it there it's not being searched on. Try putting

Re: Tips for getting unique results?

2011-04-08 Thread Peter Spam
Thanks for the note, Shaun, but the documentation indicates that the sorting is only in ascending order :-( facet.sort This param determines the ordering of the facet field constraints. • count - sort the constraints by count (highest count first) • index - to return the

Re: Trying to Post. Emails rejected as spam.

2011-04-08 Thread Parker Johnson
I have tried to change to plain text format and reword my question several times. Weird and annoying. Here is my question, maybe it'll somehow go through this time: In my master/slave setup, my slaves are polling the master every minute. My indexes are getting large, to the point where I

KStemmer for Solr 3.x +

2011-04-08 Thread Mark
Is there any compatible KStemmer (com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory) or equivalent that works well with 3.1? If not, what would be a decent alternative? Thanks

RE: One item, multiple fields, and range queries

2011-04-08 Thread wojtekpia
Hi Hoss, I realize I'm reviving a really old thread, but I have the same need, and SpanNumericRangeQuery sounds like a good solution for me. Can you give me some guidance on how to implement that? Thanks, Wojtek -- View this message in context:

Re: Special characters during indexing and searching

2011-04-08 Thread alexw
Thanks Erick. Here is the Solr response with debug on. The productName IS in the qf parameter in dismax. I have also pasted my dismax definition and the text field type definition: − 0 47 − on on 0 bit/star dismax 10 2.2 − − bit/star bit/star bit/star −

Re: Lucid Works

2011-04-08 Thread Andrzej Bialecki
On 4/8/11 4:58 PM, Mark wrote: Doesn't look like you allow new members to post questions in that forum. There's a Create new account link there, you simply need to register and log in. I have just one last question ;) We are deciding whether to upgrade our 1.4 production environment to

Re: Lucid Works

2011-04-08 Thread Andy
--- On Fri, 4/8/11, Andrzej Bialecki a...@getopt.org wrote: :) If you don't need the new functionality in 4.x, you don't need the performance improvements, What performance improvements does 4.x have over 3.1? reindexing cycles are long (indexes tend to stay around) then 3.1 is a safer

Re: Special characters during indexing and searching

2011-04-08 Thread Erick Erickson
I'm having real trouble with the formatting. Either Google has changed or somehow all the markup is getting stripped on your end. Could you send as plain text and see if that works? But from what I can make out, we're doing *something* different. Because I get parsed queries like below, and

SOLR-236 (Field Collapsing) patch and 3.1

2011-04-08 Thread Will Milspec
Hi all, We're using the solr-236 (field collapsing) patch on solr 1.4.1 and wish to upgrade to 3.1 Has anyone applied this patch to 3.1, successfully or unsuccessfully? [ftr, Solr 4.x includes field collapsing; 3.1 does not ] The issue has several patch files, including some for 1.4.1

Re: KStemmer for Solr 3.x +

2011-04-08 Thread Smiley, David W.
LucidKStemmer (& LucidGaze) are LGPL licensed -- I just verified this with the NOTICE.txt in the download. I wish Lucid's site was more clear on this -- I checked there first but found no information on the license terms. I don't know why you want an alternative. If you insist I suppose you

Re: Lucid Works

2011-04-08 Thread Andrzej Bialecki
On 4/8/11 9:55 PM, Andy wrote: --- On Fri, 4/8/11, Andrzej Bialecki a...@getopt.org wrote: :) If you don't need the new functionality in 4.x, you don't need the performance improvements, What performance improvements does 4.x have over 3.1? Ah... well, many - take a look at the

ArrayIndexOutOfBoundsException with facet query

2011-04-08 Thread Burton-West, Tom
The query below results in an array out of bounds exception: select/?q=solr&version=2.2&start=0&rows=0&facet=true&facet.field=topicStr Here is the exception: Exception during facet.field of topicStr:java.lang.ArrayIndexOutOfBoundsException: -1931149 at
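
For reproducing a facet-only request like the one above from code, the sketch below assembles the query string from individual parameters, assuming the parameters are q, version, start, rows, facet and facet.field as they appear in the message. The class and method names are invented for this example; it only builds the string and does not send the request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FacetQueryUrlSketch {

    // URL-encodes one parameter name or value as UTF-8.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    // Joins the parameters with '&' in insertion order.
    static String buildSelect(Map<String, String> params) {
        StringBuilder sb = new StringBuilder("select/?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    // Parameters for a request that returns no documents, only facet counts.
    static Map<String, String> facetOnlyParams(String query, String facetField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);
        params.put("version", "2.2");
        params.put("start", "0");
        params.put("rows", "0");
        params.put("facet", "true");
        params.put("facet.field", facetField);
        return params;
    }
}
```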

Re: UIMA example setup w/o OpenCalais

2011-04-08 Thread Jay Luker
Thank you, that worked. For the record, my objection to the OpenCalais service is that their ToS states that they will retain a copy of the metadata submitted by you, and that by submitting data to the service you grant Thomson Reuters a non-exclusive perpetual, sublicensable, royalty-free

Re: KStemmer for Solr 3.x +

2011-04-08 Thread Mark
I only want an alternative if it is not compatible with 3.1. On 4/8/11 1:26 PM, Smiley, David W. wrote: LucidKStemmer (& LucidGaze) are LGPL licensed -- I just verified this with the NOTICE.txt in the download. I wish Lucid's site was more clear on this -- I checked there first but found no

Re: KStemmer for Solr 3.x +

2011-04-08 Thread Mark
And by alternatives I meant if it was not 3.1 compatible what are other stemmers that behave/perform closely. thanks On 4/8/11 1:59 PM, Mark wrote: I only want an alternative if it is not compatible with 3.1. On 4/8/11 1:26 PM, Smiley, David W. wrote: LucidKStemmer (& LucidGaze) are LGPL

Re: Special characters during indexing and searching

2011-04-08 Thread alexw
I am using Nabble to view the thread, and the format seems to be ok: http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=reply&node=2796849 1 what version of Solr. Solr 1.4 2 have you looked in your index (admin page and/or luke) to see if what you have indexed there is what you

Re: Special characters during indexing and searching

2011-04-08 Thread alexw
Sorry wrong link to the thread, here is the correct one: http://lucene.472066.n3.nabble.com/Special-characters-during-indexing-and-searching-td2795914.html -- View this message in context: http://lucene.472066.n3.nabble.com/Special-characters-during-indexing-and-searching-tp2795914p2797158.html

Re: Lucid Works

2011-04-08 Thread Mark
How come this new version is bundled with rails and why is there no .war output format? I wanted a simple drop in replacement for my current war :( On 4/8/11 1:27 PM, Andrzej Bialecki wrote: On 4/8/11 9:55 PM, Andy wrote: --- On Fri, 4/8/11, Andrzej Bialecki a...@getopt.org wrote: :) If

Re: Lucid Works

2011-04-08 Thread Erik Hatcher
On Apr 8, 2011, at 17:32 , Mark wrote: How come this new version is bundled with rails and why is there no .war output format? Rails, via JRuby, is used in LucidWorks Enterprise for both the admin and search interfaces. (and also powers the Alerts REST API). I wanted a simple drop in

Re: KStemmer for Solr 3.x +

2011-04-08 Thread David Smiley (@MITRE.org)
I see no reason why it would not be compatible. - Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/KStemmer-for-Solr-3-x-tp2796594p2798213.html Sent from the Solr - User mailing list archive at