Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

2012-09-07 Thread guenter.hip...@unibas.ch
Erick, thanks for the response! Our use case is very straightforward and basic. - no cloud infrastructure - XMLUpdateRequest - handler (transformed library bibliographic data which is pushed by the post.jar component). For deletions I used to use the solrJ component until two months ago but because

Re: SOLR 4.0 / Jetty Security Set Up

2012-09-07 Thread Paul Libbrecht
Erick, I think that should be described differently... You need to set-up protected access for some paths. /update is one of them. And you could make this protected at the jetty level or using Apache proxies and rewrites. Probably /select should be kept open but you need to evaluate if that can

SOLR 4.0 DataImport frozen or fails with WARNING: Unable to read: dataimport.properties?

2012-09-07 Thread deniz
Hi all, I have been trying to index my data from a MySQL db, but somehow I can't index anything, and I don't see any exception / error in the logs, except a warning which is highlighted below... Here is my db-config's connection string: <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"

RE: Is Boilerpipe usable through Solr ExtractingUpdateHandler or the DIH?

2012-09-07 Thread Markus Jelsma
Hi, It should not be so hard but it looks like the current SolrContentHandler builds up the document via SAX-events. You could pass a BoilerpipeContentHandler((ContentHandler)parsingHandler, BoilerpipeExtractor) to the parser in ExtractingDocumentLoader.java. It should work. Markus

Re: SOLR 4.0 / Jetty Security Set Up

2012-09-07 Thread Tomas Zerolo
On Fri, Sep 07, 2012 at 08:50:58AM +0200, Paul Libbrecht wrote: Erick, I think that should be described differently... You need to set-up protected access for some paths. /update is one of them. And you could make this protected at the jetty level or using Apache proxies and rewrites. So

groups.limit=0 in sharding core results in IllegalArgumentException

2012-09-07 Thread mechravi25
Hi, I'm using Solr version 3.6.1. I have kept corex as the common core, i.e. I've used the sharding concept on this core to get the indexed data from all the other cores. Here, if I use grouping with groups.limit=0, it results in the following exception: numHits must be > 0; please use

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Erik Hatcher
Pivot facets currently only work with individual terms, not ranges. The response you provided does look odd in that there are duplicate timestamps listed, but pivots were only implemented for textual (string being the most common type) fields initially. Erik On Sep 6, 2012, at 19:04 ,

Marco Scalone is out of the office.

2012-09-07 Thread Marco Scalone
I will be out of the office from Fri 07/09/2012 and will not return until Thu 20/09/2012. I will reply to your message when I return.

Re: SOLR 4.0 / Jetty Security Set Up

2012-09-07 Thread dan sutton
Hi, If like most people you have application server(s) in front of solr, the simplest and most secure option is to bind solr to a local address (192.168.* or 10.0.0.*). The app server talks to solr via the local (a.k.a blackhole) ip address that no-one from outside can ever access as it's not

RE: Is Boilerpipe usable through Solr ExtractingUpdateHandler or the DIH?

2012-09-07 Thread Markus Jelsma
It works indeed: https://issues.apache.org/jira/browse/SOLR-3808 -Original message- From:Markus Jelsma markus.jel...@openindex.io Sent: Fri 07-Sep-2012 10:40 To: solr-user@lucene.apache.org Subject: RE: Is Boilerpipe usable through Solr ExtractingUpdateHandler or the DIH? Hi,

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Dotan Cohen
On Fri, Sep 7, 2012 at 12:23 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Pivot facets currently only work with individual terms, not ranges. The response you provided does look odd in that there are duplicate timestamps listed, but pivots were only implemented for textual (string being

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Erik Hatcher
On Sep 7, 2012, at 08:36 , Dotan Cohen wrote: On Fri, Sep 7, 2012 at 12:23 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Pivot facets currently only work with individual terms, not ranges. The response you provided does look odd in that there are duplicate timestamps listed, but pivots

Re: Solr 4.0alpha: edismax complaints on certain characters

2012-09-07 Thread Alexandre Rafalovitch
Thank you. I can confirm that moving to Beta has made that problem go away. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Dotan Cohen
On Fri, Sep 7, 2012 at 4:05 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Ranges won't work at all; pivots are purely by individual term currently. If you want to pivot by ranges, and you can define those ranges during indexing, then you could make a field that represented which range
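
The suggestion above (precompute a field that names the range, then pivot on that field) could be sketched at indexing time roughly as follows. This is a minimal Python sketch, not code from the thread: the bucket boundaries, the labels, and the field names `price` and `price_range` are all assumptions chosen for illustration, and values are assumed to be non-negative.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries and labels; pick ones that match your data.
BOUNDARIES = [10, 100, 1000]
LABELS = ["0-9", "10-99", "100-999", "1000+"]


def range_label(value):
    """Return the label of the bucket that contains a non-negative value."""
    return LABELS[bisect_right(BOUNDARIES, value)]


def add_range_field(doc, source_field="price", target_field="price_range"):
    """Return a copy of doc with a string field naming the range of source_field."""
    enriched = dict(doc)
    enriched[target_field] = range_label(doc[source_field])
    return enriched
```

The enriched documents would then be indexed with `price_range` as a plain string field, so that a pivot over it sees one term per document instead of a numeric range.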

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Erik Hatcher
On Sep 7, 2012, at 09:29 , Dotan Cohen wrote: On Fri, Sep 7, 2012 at 4:05 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Ranges won't work at all; pivots are purely by individual term currently. If you want to pivot by ranges, and you can define those ranges during indexing, then you

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Yonik Seeley
On Fri, Sep 7, 2012 at 9:39 AM, Erik Hatcher erik.hatc...@gmail.com wrote: A trie field probably doesn't work properly, as it indexes multiple terms per value and you'd get odd values. I don't know about pivot faceting, but all of the other types of faceting take this into account (hence

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Dotan Cohen
On Fri, Sep 7, 2012 at 4:39 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Just to be clear, as I'm not logged onto the dev server at the moment but it was implied in an earlier mail: Any field that is to be pivoted on needs to be a string field? Is that documented, as I cannot find that in

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Dotan Cohen
On Fri, Sep 7, 2012 at 5:04 PM, Yonik Seeley yo...@lucidworks.com wrote: On Fri, Sep 7, 2012 at 9:39 AM, Erik Hatcher erik.hatc...@gmail.com wrote: A trie field probably doesn't work properly, as it indexes multiple terms per value and you'd get odd values. I don't know about pivot faceting,

Re: groups.limit=0 in sharding core results in IllegalArgumentException

2012-09-07 Thread yriveiro
Hi, I have the same issue using solr 4.0-ALPHA. -- View this message in context: http://lucene.472066.n3.nabble.com/groups-limit-0-in-sharding-core-results-in-IllegalArgumentException-tp4006086p4006110.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

2012-09-07 Thread Erick Erickson
Thank the guys who actually fixed it! Thanks for bringing this up, and please let us know if Yonik's patch fixes your problem Best Erick On Thu, Sep 6, 2012 at 11:39 PM, guenter.hip...@unibas.ch guenter.hip...@unibas.ch wrote: Erick, thanks for response! Our use case is very straight

Access and copy lucene index data

2012-09-07 Thread Bill_78
Dear all, Similar subjects about index data have already been posted, but I would like your advice. I use Solr analysers to process fields, like synonyms, stopwords, ... and I cannot see the result without using a special tool (like LukeRequestHandler for example). I would like to copy the index

Re: Website (crawler for) indexing

2012-09-07 Thread Dominique Bejean
Maybe you can take a look at Crawl-Anywhere, which has an administration web interface, a Solr indexer and a search web application. www.crawl-anywhere.com Regards. Dominique On 05/09/12 17:05, Lochschmied, Alexander wrote: This may be a bit off topic: How do you index an existing website and

Re: SOLR 4.0 DataImport frozen or fails with WARNING: Unable to read: dataimport.properties?

2012-09-07 Thread Travis Low
Change your data-config.xml connection XML to this: <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://dbhost:3396/myDB" user="XXX" password="XXX" batchSize="-1" /> Then try again. This keeps the driver from trying to fetch the entire result set at the same time. cheers,

Re: Access and copy lucene index data

2012-09-07 Thread Jack Krupansky
You can use the Solr admin analysis web page to enter a term or even a passage of text and see how it would be analyzed/indexed for any specified field or field type. -- Jack Krupansky -Original Message- From: Bill_78 Sent: Friday, September 07, 2012 11:23 AM To:

Indexing CSV files with filenames

2012-09-07 Thread edvicif
Hi! I have a set of CSV files. I want to index them by certain columns, but I also want to store the name of the file they were indexed from. The reason is that the queries I want to run are meant to identify files. David -- View this message in context:

Re: Indexing CSV files with filenames

2012-09-07 Thread Rafał Kuć
Hello! In Solr 4.0 you will have the ability to add arbitrary field along with all documents from a single file - http://wiki.apache.org/solr/UpdateCSV#literal -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi! I've have a set of CSV

Re: solrcloud setup using tomcat, single machine

2012-09-07 Thread Mark Miller
<Environment name="solr/home" type="java.lang.String" value="/usr/solr/data/conf" override="true" /> The above does not look right - you probably would want /usr/solr/example/solr for your solrhome based on other info you give. You also reference /usr/solr/data/conf as your conf folder, but I'd

Re: Indexing CSV files with filenames

2012-09-07 Thread edvicif
Thx for the quick answer. Can you help a little more? I don't really get the concept of literal. How can I set a field with the source absolute path? I mean, how can I find out the parameter names? An example would be really helpful. -- View this message in context:

Re: Indexing CSV files with filenames

2012-09-07 Thread Rafał Kuć
Hello! You can just pass the name of the file to the 'literal' parameter. For example adding literal.filename=my_file.csv would set the 'filename' field of your document with the value of 'my_file.csv'. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch -
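
The `literal.filename=my_file.csv` parameter described above could be attached to a CSV update request along these lines. A minimal Python sketch under stated assumptions: the base URL `http://localhost:8983/solr`, the `/update/csv` handler path, and the `commit=true` parameter are illustrative choices, not details given in the thread, and the target field (`filename` here) has to exist in the schema.

```python
from urllib.parse import urlencode


def csv_update_url(base_url, file_name, field="filename"):
    """Build a CSV update URL that stamps every row with a literal field value."""
    params = {
        "commit": "true",
        # The part after "literal." is the name of the schema field to fill.
        "literal." + field: file_name,
    }
    return base_url + "/update/csv?" + urlencode(params)


# Hypothetical local Solr instance, used only for illustration.
url = csv_update_url("http://localhost:8983/solr", "my_file.csv")
```

So the left-hand side of the parameter is just `literal.` followed by the field name, and the right-hand side is the value every document from that file receives.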

Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Tom Burton-West
Hello all, Due to multiple languages and dirty OCR, our indexes have over 2 billion unique terms ( http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again). In Solr 3.6 and previous we needed to reduce the memory used for storing the in-memory representation of the tii file. We

Schema model to store additional field metadata

2012-09-07 Thread sysrq
Hi, I want to create a Solr index of articles. Each article should have a title, content, published date and an arbitrary number of images attached to. An article could look like this: title: An article about Foo and Bar content: This is some text about Foo and Bar. published:

Re: Indexing CSV files with filenames

2012-09-07 Thread edvicif
My problem is more like the left-hand side of the equation. Is it ${f.name} or something? On Sep 7, 2012 5:36 PM, Rafał Kuć-3 [via Lucene] ml-node+s472066n4006179...@n3.nabble.com wrote: Hello! You can just pass the name of the file to the 'literal' parameter. For example adding

Re: Schema model to store additional field metadata

2012-09-07 Thread Alexandre Rafalovitch
Why would you store the actual images in SOLR? There is no way to really search the bytes of image, is there? What you probably want to do is extract all searchable metadata out of that image, name, alt, EXIF, etc. And you are most likely looking at dynamic fields as the solution 1) Define

Re: Version Migration from solr 1.3

2012-09-07 Thread Sujatha Arun
I see that 4.0 alpha has been released after 3.6.1, so should I look at 3.5 as the most stable release currently? Version Source : https://issues.apache.org/jira/browse/SOLR?selectedTab=com.atlassian.jira.plugin.system.project%3Aversions-panel Regards Sujatha On Fri, Sep 7, 2012 at 11:17 PM,

Re: Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Tom Burton-West
Thanks Robert, I'll have to spend some time understanding the default codec for Solr 4.0. Did I miss something in the changes file? I'll be digging into the default codec docs and testing sometime in next week or two (with a 2 billion term index) If I understand it well enough, I'll be happy

Re: Version Migration from solr 1.3

2012-09-07 Thread Mani
If you have time, you might as well wait for 4.0 to be released otherwise 3.6.1 -- View this message in context: http://lucene.472066.n3.nabble.com/Version-Migration-from-solr-1-3-tp4006193p4006200.html Sent from the Solr - User mailing list archive at Nabble.com.

Solr 4: Private master, public slave?

2012-09-07 Thread Alexandre Rafalovitch
Hello, I have a bunch of documents that I would like to index on a local server behind the firewall. But then, the actual search will happen on a public infrastructure (Amazon, etc). The documents themselves are not quite public, so I want just the index content (indexed, not stored) being

Re: Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Robert Muir
On Fri, Sep 7, 2012 at 2:19 PM, Tom Burton-West tburt...@umich.edu wrote: Thanks Robert, I'll have to spend some time understanding the default codec for Solr 4.0. Did I miss something in the changes file? http://lucene.apache.org/core/4_0_0-BETA/ see the file formats section, especially

Re: Re: Schema model to store additional field metadata

2012-09-07 Thread sysrq
Why would you store the actual images in SOLR? No, the images are files on the filesystem. Only the path to the image should be stored in Solr. And you are most likely looking at dynamic fields as the solution 1) Define *_Path, *_Size, *_Alt as a dynamic field with appropriate types 2)

[Solr4 beta] error 503 on commit

2012-09-07 Thread Antoine LE FLOC'H
Hello, Using package org.apache.solr.client.solrj; when I do: UpdateResponse ur = solrServer.commit(false, false); I get sometimes (not often): SolrException e where e.code() == SolrException.ErrorCode.SERVICE_UNAVAILABLE.code When I catch this exception, I try to commit again, the call
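
The "catch the 503 and commit again" approach described above can be written as a bounded retry with a growing delay. This is a Python sketch of the pattern only, not SolrJ code: `ServiceUnavailable` is a hypothetical stand-in for whatever exception the client raises on a 503, and `commit` is any callable supplied by the caller.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for a 503 (service unavailable) error."""


def commit_with_retry(commit, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call commit(), retrying on ServiceUnavailable up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return commit()
        except ServiceUnavailable:
            if attempt == attempts:
                raise  # give up and let the caller see the last failure
            sleep(delay_seconds * attempt)  # wait a little longer each time
```

Retrying only hides the symptom; the server logs are still the place to find out why the 503 is returned in the first place.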

Re: How to preserve source column names in multivalue catch all field

2012-09-07 Thread Kiran Jayakumar
Thank you Erick. I think #2 is the best for me because I have more than a hundred fields and don't want to construct a huge query each time. On Thu, Sep 6, 2012 at 9:38 PM, Erick Erickson erickerick...@gmail.com wrote: Try using edismax to distribute the search across the fields rather than using the

Re: Problem with verifying signature ?

2012-09-07 Thread Kiran Jayakumar
Thank you. On Thu, Sep 6, 2012 at 9:51 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : gpg: Signature made 08/06/12 19:52:21 Pacific Daylight Time using RSA key : ID 322 : D7ECA : gpg: Good signature from Robert Muir (Code Signing Key) rm...@apache.org : *gpg: WARNING: This key is

Re: Solr search not working after copying a new field to an existing Indexed Field

2012-09-07 Thread Kiran Jayakumar
Do you have the unique key set up in your schema.xml? It should be automatic if you have the ID field and define it as the unique key. <uniqueKey>ID</uniqueKey> On Thu, Sep 6, 2012 at 11:50 AM, Mani mehamba...@art.com wrote: I have made a schema change to copy an existing field name (Source

Re: Solr search not working after copying a new field to an existing Indexed Field

2012-09-07 Thread Mani
Yes, I do have this uniqueKey defined properly. <uniqueKey>id</uniqueKey> Before the schema change... <copyField source="itemSKU" dest="text"/> <copyField source="itemCategories" dest="text"/> After the schema change... <copyField source="itemName" dest="text"/> <copyField source="itemCategories" dest="text"/> <copyField

Re: Importing of unix date format from mysql database and dates of format 'Thu, 06 Sep 2012 22:32:33 +0000' in Solr 4.0

2012-09-07 Thread Chris Hostetter
: When i index a text field which has arabic and English like this tweet : “@anaga3an: هو سعد الحريري بيعمل ايه غير تحديد الدوجلاس ويختار الكرافته ؟؟” : #gcc #ksa #lebanon #syria #kuwait #egypt #سوريا : with field_type as 'text_ar' and when i try to see the same field again in : solr, it is

Re: Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Tom Burton-West
Thanks Robert, if not, just customize blocktree's params with a CodecFactory in solr, or even pick another implementation (FixedGap, VariableGap, whatever). Still trying to get my head around 4.0 and flexible indexing. I'll take another look at Mike's and your presentations. I'm trying to

Re: [Solr4 beta] error 503 on commit

2012-09-07 Thread Chris Hostetter
: I get sometimes (not often): : SolrException e where e.code() == : SolrException.ErrorCode.SERVICE_UNAVAILABLE.code Are there any errors in your solr server logs? Are you using the DistributedUpdateProcessor (ie: SolrCloud) ? There aren't many places in Solr that will throw a 503 status

Why is using edismax in Admin UI puts edismax=true but not defType=edismax?

2012-09-07 Thread Alexandre Rafalovitch
Hello, I am not sure edismax=true as a flag actually does anything (Solr4 beta): 'responseHeader'=>{ 'status'=>0, 'QTime'=>1, 'params'=>{ 'debugQuery'=>'true', 'indent'=>'true', 'edismax'=>'true', 'q'=>'text', 'qf'=>'TitleEN DescEN', 'wt'=>'ruby',

Re: Why is using edismax in Admin UI puts edismax=true but not defType=edismax?

2012-09-07 Thread Chris Hostetter
: I am not edismax=true as a flag actually does anything (Solr4 beta): Alexandre: You are 100% correct, this appears to be a bug in the Admin UI. Thank you for reporting it... https://issues.apache.org/jira/browse/SOLR-3811 -Hoss
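
For comparison, a request that actually selects the extended dismax parser passes `defType=edismax` rather than `edismax=true`. A minimal Python sketch that builds such a query string, reusing the `q` and `qf` values quoted in the report; the function name is hypothetical.

```python
from urllib.parse import urlencode


def edismax_params(query, query_fields):
    """Return request parameters that select the edismax query parser."""
    return {
        "q": query,
        "defType": "edismax",  # this selects the parser; edismax=true does not
        "qf": " ".join(query_fields),
    }


query_string = urlencode(edismax_params("text", ["TitleEN", "DescEN"]))
```

The resulting string can be appended to a `/select` URL; the space between the `qf` field names is encoded as `+`.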

Re: Importing of unix date format from mysql database and dates of format 'Thu, 06 Sep 2012 22:32:33 +0000' in Solr 4.0

2012-09-07 Thread Shawn Heisey
On 9/6/2012 6:54 PM, kiran chitturi wrote: The error i am getting is 'org.apache.solr.common.SolrException: Invalid Date String: '1345743552'. I think it was being saved as a string in DB, so i will use the DateFormatTransformer. To go along with all the other replies that you have gotten:
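
Both date shapes from this thread (a Unix timestamp stored as a string, and an RFC 2822 style date such as 'Thu, 06 Sep 2012 22:32:33 +0000') can be normalized to the ISO 8601 UTC form that Solr date fields accept. A minimal Python sketch of a client-side conversion before indexing, offered as an alternative illustration and not as the DateFormatTransformer configuration discussed in the thread; the function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Solr date fields expect UTC timestamps in this shape.
SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def epoch_to_solr_date(epoch_seconds):
    """Convert Unix seconds (int or numeric string) to a Solr date string."""
    moment = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return moment.strftime(SOLR_DATE_FORMAT)


def rfc2822_to_solr_date(text):
    """Convert an RFC 2822 date with an explicit UTC offset to a Solr date string."""
    moment = parsedate_to_datetime(text).astimezone(timezone.utc)
    return moment.strftime(SOLR_DATE_FORMAT)
```

The second function assumes the input carries an explicit offset such as `+0000`; a date without usable zone information would be interpreted in the local time zone by `astimezone`.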

RE: [Solr4 beta] error 503 on commit

2012-09-07 Thread Markus Jelsma
Hi, We've seen this too on one of the test nodes yesterday; it ran on a build a few days old. The node receiving documents complained it could not forward them to the fifth node and returned a 503. The fifth node itself only logged an NPE and the 503, nothing more, no stack traces. There

Re: N-gram ranking based on term position

2012-09-07 Thread Amit Nithian
I think your thought about using the edge ngram as a field and boosting that field in the qf/pf sections of the dismax handler sounds reasonable. Why do you have qualms about it? On Fri, Sep 7, 2012 at 12:28 PM, Kiran Jayakumar kiranjuni...@gmail.com wrote: Hi, Is it possible to score

Re: N-gram ranking based on term position

2012-09-07 Thread Kiran Jayakumar
Since Edge N-gram tokens are a subset of N-gram tokens, I was wondering if I could be a bit more space efficient. On Fri, Sep 7, 2012 at 3:07 PM, Amit Nithian anith...@gmail.com wrote: I think your thought about using the edge ngram as a field and boosting that field in the qf/pf sections of

Term searches with colon(:)

2012-09-07 Thread Nemani, Raj
All, I was wondering if anybody has run into this issue before. Solr is not returning any search results for words that contain a colon ( : ) when we perform a term search containing a colon. We do escape this correctly, I believe, as shown in the sample (taken from tomcat logs) Sep 06, 2012

Re: Term searches with colon(:)

2012-09-07 Thread Chris Hostetter
: I was wondering if anybody has run into this issue before. Solr is not : returing any search results for word that contain colon ( : ) in it : when we perform a term search containing colon. We do escape this : correctly, I believe as shown in the sample (taken from tomcat logs) ... :
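
For reference, escaping a colon (and the other query syntax characters) on the client side could look like the sketch below. This is a minimal Python illustration and the function name is hypothetical. The character list follows the classic Lucene query parser syntax, and SolrJ users have `ClientUtils.escapeQueryChars` for the same purpose. Escaping only affects query parsing; whether the term then matches still depends on how the field's analyzer treated the colon at index time.

```python
import re

# Characters with special meaning in the classic Lucene query syntax.
_SPECIAL_CHARS = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_term(term):
    """Prefix every query syntax character in term with a backslash."""
    return _SPECIAL_CHARS.sub(r"\\\1", term)
```

With this, a term such as `foo:bar` is sent as `foo\:bar`, so the parser no longer reads `foo` as a field name.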