Re: Schema Change: Int - String (i am the original poster, new email address)

2013-06-07 Thread z z
Maybe if I were to say that the column user_id will become user_ids that would clarify things? user_id:2002+AND+created:[${from}+TO+${until}]+data:more becomes user_id*s*:2002+AND+created:[${from}+TO+${until}]+data:more where I want 2002 to be an exact positive match on one of the user_ids

Re: Solr 4.2.1 higher memory footprint vs Solr 3.5

2013-06-07 Thread Bernd Fehling
Hi Shawn, I also had CMS with tons of tuning options but still had a bigger GC pause once in a while. After switching to JDK7 I tried G1GC with no other options and it runs perfectly. With CMS I saw that the old and young generations were growing until they had to do a GC. This produces the sawtooth and

Re: [blogpost] Memory is overrated, use SSDs

2013-06-07 Thread Toke Eskildsen
On Fri, 2013-06-07 at 07:15 +0200, Andy wrote: One question I have is did you precondition the SSD ( http://www.sandforce.com/userfiles/file/downloads/FMS2009_F2A_Smith.pdf )? SSD performance tends to take a very deep dive once all blocks are written at least once and the garbage collector

Re: nutch 1.4, solr 3.4 configuration error

2013-06-07 Thread Tuğcem Oral
I had a similar error. I couldn't find any documentation on which Nutch and Solr versions are compatible. For instance, we're using Nutch 1.6 on Hadoop 1.0.4 with SolrJ 3.4.0 and index crawled segments to Solr 4.2.0. But I remember that I could find a compatible version of SolrJ for Nutch 1.4

Clear cache used by Solr

2013-06-07 Thread Varsha Rani
Hi, I'm trying to compare the performance of different Solr queries. In order to get a fair test, I want to clear the cache between queries. How is this done? Of course, one can restart the server, but I want to know if there is a quicker way.

solr.NoOpDistributingUpdateProcessorFactory in SOLR CLOUD

2013-06-07 Thread sathish_ix
Hi, I need more information on how NoOpDistributingUpdateProcessorFactory works. Below is the cloud setup: collection1 shard1 ---node1:8983 (leader) | | _ _ _ _ _ _ _ _ _ _ node2:8984 | |_ _ _ _ _ _ _ _ _ _ _ _ shard2--- node3:7585

Re: Configuring seperate db-data-config.xml per shard

2013-06-07 Thread sathish_ix
Hi, we were able to accomplish this with a single collection. ZooKeeper: create a separate node for each shard, and upload the dbconfig file under each shard, e.g. /config/config1/shard1 /config/config1/shard2 /config/config1/shard3. In the solrconfig.xml, <requestHandler name="/dataimport"

Re: Is there a way to load multiple schema when using zookeeper?

2013-06-07 Thread sathish_ix
Hi, we were able to accomplish this with a single collection. ZooKeeper: create a separate node for each shard, and upload the dbconfig file under each shard, e.g. /config/config1/shard1 /config/config1/shard2 /config/config1/shard3. In the solrconfig.xml, <requestHandler name="/dataimport"

Re: Clear cache used by Solr

2013-06-07 Thread Toke Eskildsen
On Fri, 2013-06-07 at 09:24 +0200, Varsha Rani wrote: I'm trying to compare the performance of different Solr queries. In order to get a fair test, I want to clear the cache between queries. How is this done? Of course, one can restart the server, but I want to know if there is a quicker way.

Re: LotsOfCores feature

2013-06-07 Thread Aleksey
A use case would be a web site or service that had millions of users, each of whom would have an active Solr core when they are active, but inactive otherwise. Of course those cores would not all reside on one node and ZooKeeper is out of the question for managing anything that is in the

Using Solr Scripts

2013-06-07 Thread Furkan KAMACI
I have a SolrCloud and I want to perform some important maintenance tasks on it, i.e. back up indexes, start and stop Solr nodes individually, send an optimize request to the cloud, etc. However, I see that there is a scripts folder that comes with Solr. Can I use some of them for my purposes or should I

How to stop index distribution among shards in solr cloud

2013-06-07 Thread sathish_ix
Hi, I have two shards; logically each shard corresponds to a region. Currently the index is distributed across the shards in SolrCloud. How can I load the index to a specific shard in SolrCloud? Any thoughts? Thanks, Sathish

Solr4.3 Internationalization.

2013-06-07 Thread bsargurunathan
Guys, please clarify the following questions regarding Solr internationalization. 1) Initially my requirement is to support 2 languages (English and French) for a web application, and we are using a MySQL DB. 2) So please share a good and easy approach to achieve it, with some sample configs. 3)

Re: LotsOfCores feature

2013-06-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
The Wiki page was not built for Cloud Solr. We have done such a deployment where less than a tenth of the cores were active at any given point in time. Though there were tens of millions of indices, they were split among a large number of hosts. If you don't insist on a Cloud deployment it is possible. I'm

Re: SOLR CSV output in custom order

2013-06-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
Have you tried explicitly giving the field names (fl) as parameter http://wiki.apache.org/solr/CommonQueryParameters#fl On Thu, Jun 6, 2013 at 12:41 PM, anurag.jain anurag.k...@gmail.com wrote: I want output of csv file in proper order. when I use wt=csv it gives output in random order. Is
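
A minimal sketch of the suggestion above, assuming the CSV response writer emits columns in the order listed in fl. The host, handler path, and the field names id, name, and price are placeholders chosen for illustration, not values from the thread:

```python
from urllib.parse import urlencode


def csv_query_url(base_url, query, fields):
    """Build a select URL that asks for CSV output with an explicit column list."""
    params = {
        "q": query,
        "wt": "csv",              # request the CSV response writer
        "fl": ",".join(fields),   # explicit field list, in the desired column order
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and field names, used only to show the shape of the request.
url = csv_query_url("http://localhost:8983/solr/select", "*:*", ["id", "name", "price"])
print(url)
```

Changing the order of the list passed as fields is then the only thing needed to reorder the requested columns.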

Re: [blogpost] Memory is overrated, use SSDs

2013-06-07 Thread Erick Erickson
Thanks for this, hard data is always welcome! Another blog post for my reference list! Erick On Fri, Jun 7, 2013 at 2:59 AM, Toke Eskildsen t...@statsbiblioteket.dk wrote: On Fri, 2013-06-07 at 07:15 +0200, Andy wrote: One question I have is did you precondition the SSD (

Re: solr.NoOpDistributingUpdateProcessorFactory in SOLR CLOUD

2013-06-07 Thread Erick Erickson
I don't think you want the noop bits, I'd go back to the standard definitions here. What you _do_ want, I think, is the custom hashing option, see: https://issues.apache.org/jira/browse/SOLR-2592 which has been in place since Solr 4.1. It allows you to send documents to the shard of your choice,

Re: Clear cache used by Solr

2013-06-07 Thread Erick Erickson
I really question whether this is valuable. Much of Solr performance is there explicitly because of caches, so what you're measuring is disk I/O to fill caches and any other latency. I'm just not sure what operational information you'll get here. But assuming that you're really getting actionable

solr facet query on multiple search term

2013-06-07 Thread vrparekh
Hello All, I require facet counts for multiple search terms. Currently I am doing two separate facet queries, one for each search term, with facet.range=dateField, e.g. http://solrserver/select?q=1stsearchTermfq=onfacet-parameters

Re: LotsOfCores feature

2013-06-07 Thread Erick Erickson
I should have been clearer, and others have mentioned... the lots of cores stuff is really outside Zookeeper/SolrCloud at present. I don't think it's incompatible, but it wasn't part of the design so it'll need some effort to make it play nice with SolrCloud. I'm not sure there's actually a

Documents

2013-06-07 Thread acasaus
Good morning, I would like to know how I can modify an XML file so that it accesses my information rather than the example information, because I have one file from which I obtain the information that I show to the user with Blacklight. Sorry about my English, Alex

Re: Documents

2013-06-07 Thread Dmitry Kan
hi, you need to parse your custom xml file and transform it into the xml format that solr understands. If you are familiar with xslt, you could do that in a few lines depending on the complexity of the input xml file. Dmitry On Fri, Jun 7, 2013 at 3:34 PM, acas...@greendata.com
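
For readers who would rather script the transformation than write XSLT, here is a rough Python sketch of the same idea: parse the custom file and emit Solr's `<add><doc><field name="...">` update format. The input layout (library/book/isbn/heading) and the target field names id and title are invented for illustration, since the original poster's file is not shown:

```python
import xml.etree.ElementTree as ET


def to_solr_add_xml(custom_xml, record_tag, field_map):
    """Convert custom XML records into a Solr <add> update message.

    record_tag: tag name of one record in the custom file (assumption).
    field_map:  maps child tag names in the custom file to Solr field names.
    """
    source = ET.fromstring(custom_xml)
    add = ET.Element("add")
    for record in source.iter(record_tag):
        doc = ET.SubElement(add, "doc")
        for source_tag, solr_field in field_map.items():
            for node in record.findall(source_tag):
                # Skip empty elements so no blank field values are sent.
                if node.text and node.text.strip():
                    field = ET.SubElement(doc, "field", name=solr_field)
                    field.text = node.text.strip()
    return ET.tostring(add, encoding="unicode")


# Hypothetical input; the real file would have its own tags.
sample = "<library><book><isbn>123</isbn><heading>Solr basics</heading></book></library>"
solr_xml = to_solr_add_xml(sample, "book", {"isbn": "id", "heading": "title"})
print(solr_xml)
```

The resulting string could then be posted to the update handler; the target field names have to exist in the schema in use.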

Re: Doubt Regarding Shards Index

2013-06-07 Thread sathish_ix
Hi, how did you distribute the index by year to different shards? Do we need to write any code? Thanks, Sathish

[CROSS-POSTING] SOLR-4903 and SOLR-4904

2013-06-07 Thread Dmitry Kan
CROSS-POSTING from dev list. Hi guys, As discussed with Grant and Andrzej I have created two jiras related to inefficiency in distributed faceting. This affects 3.4, but my gut feeling is telling me 4.x is affected as well. Regards, Dmitry Kan P.S. Asking this question won yours truly second

Re: HdfsDirectoryFactory

2013-06-07 Thread Mark Miller
Eagle eye man. Yeah, we plan on contributing hdfs support for Solr. I'm flying home today and will create a JIRA issue for it shortly after I get there. - Mark On Jun 6, 2013, at 6:16 PM, Jamie Johnson jej2...@gmail.com wrote: I've seen reference to an HdfsDirectoryFactory in the new

Re: Doubt Regarding Shards Index

2013-06-07 Thread Dmitry Kan
Hi, Sharding by time by itself does not need any custom code on solr side: start indexing your data to a shard, depending on the timestamp of your document. The querying part is trickier if you want to have one front end solr: it should know which shards to query. If querying all shards for each
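
A small sketch of the routing described above, assuming one shard per calendar year. The host names and the year-to-shard table are hypothetical; the only Solr feature relied on is the shards request parameter of distributed search, which takes a comma-separated list of host:port/path entries:

```python
from datetime import datetime

# Hypothetical layout: one shard per year. Hosts are placeholders.
SHARDS_BY_YEAR = {
    2011: "host1:8983/solr",
    2012: "host2:8983/solr",
    2013: "host3:8983/solr",
}


def shard_for(timestamp):
    """Indexing side: pick the shard that owns a document's timestamp."""
    return SHARDS_BY_YEAR[timestamp.year]


def shards_param(start, end):
    """Query side: list only the shards that overlap the requested time range."""
    years = range(start.year, end.year + 1)
    return ",".join(SHARDS_BY_YEAR[y] for y in years if y in SHARDS_BY_YEAR)


print(shard_for(datetime(2012, 5, 1)))
print(shards_param(datetime(2012, 11, 1), datetime(2013, 2, 1)))
```

The front end would pass the second value as the shards parameter, so a query restricted to a time range only touches the shards that can contain matches.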

Re: Solr 4.2.1 higher memory footprint vs Solr 3.5

2013-06-07 Thread Otis Gospodnetic
This is exactly what we did for a client (alas, using Elasticsearch). We then observed better performance through SPM. We used the latest Oracle JVM. Otis Solr ElasticSearch Support http://sematext.com/ On Jun 7, 2013 2:55 AM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote: Hi Shawn, I

Re: Clear cache used by Solr

2013-06-07 Thread Yonik Seeley
On Fri, Jun 7, 2013 at 7:32 AM, Erick Erickson erickerick...@gmail.com wrote: I really question whether this is valuable. Much of Solr performance is there explicitly because of caches Right, and it's also the case that certain solr features are coded with the cache in mind (i.e. they will be

Re: LotsOfCores feature

2013-06-07 Thread Jack Krupansky
AFAICT, SolrCloud addresses the use case of distributed update for a relatively smaller number of collections (dozens?) that have a relatively larger number of rows - billions over a modest to moderate number of nodes (a handful to a dozen or dozens). So, maybe dozens of collections (some

Re: Schema Change: Int - String (i am the original poster, new email address)

2013-06-07 Thread Jack Krupansky
Right, a search for 442 would not match 1442. -- Jack Krupansky -Original Message- From: z z Sent: Friday, June 07, 2013 2:18 AM To: solr-user@lucene.apache.org Subject: Re: Schema Change: Int - String (i am the original poster, new email address) Maybe if I were to say that the

Re: OR query with null value and non-null value(s)

2013-06-07 Thread Jack Krupansky
Yes, it SHOULD! And in the LucidWorks Search query parser it does. Why doesn't it in Solr? Ask Yonik to explain that! -- Jack Krupansky -Original Message- From: Rahul R Sent: Friday, June 07, 2013 1:21 AM To: solr-user@lucene.apache.org Subject: Re: OR query with null value and

Re: Solr 4.2.1 higher memory footprint vs Solr 3.5

2013-06-07 Thread adityab
Hi All, I work with Sandeep M, so I am continuing from his comments. We did observe memory growth. We use jdk1.6.0_45 with CMS. We see this issue because of large document size. By large I mean our single document has large multivalued fields. We found that JIRA LUCENE-4995

Re: Documents

2013-06-07 Thread Alexandre Rafalovitch
If you are trying to import an external XML file into your system, you may want to look at DataImportHandler. It is a good way to start. Look at Wikipedia examples. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is

Re: Solr4.3 Internationalization.

2013-06-07 Thread Alexandre Rafalovitch
It may be helpful to approach this from the other side. Specifically search. Are you: 1) Expecting to search across both French and English content (e.g. French, but fallback to English if translation is missing)? If yes, you want a single collection 2) Is French content completely separate from

Solr 4.3.0 Cloud Issue indexing pdf documents

2013-06-07 Thread Mark Wilson
Hi, I am having an issue with adding pdf documents to a SolrCloud index I have set up. I can index pdf documents fine using 4.3.0 on my local box, but I have a SolrCloud instance set up on the Amazon Cloud (using 2 servers) and I get Error. It seems that it is not loading

Custom Data Clustering

2013-06-07 Thread Raheel Hasan
Hi, Can someone please tell me if there is a way to have a custom clustering of the data from Solr query results? I am facing 2 issues currently: 1. The Carrot clustering only applies clustering to the paged results (i.e. in the current pagination's page results). 2. I need to

RE: How to stop index distribution among shards in solr cloud

2013-06-07 Thread James Thomas
This may help: http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud --- See Document Routing section. -Original Message- From: sathish_ix [mailto:skandhasw...@inautix.co.in] Sent: Friday, June 07, 2013 5:27 AM To: solr-user@lucene.apache.org Subject: How to

Re: solr.NoOpDistributingUpdateProcessorFactory in SOLR CLOUD

2013-06-07 Thread Chris Hostetter
: I don't think you want the noop bits, I'd go back to the : standard definitions here. Correct. the NoOpDistributingUpdateProcessorFactory is for telling the update processor chain that you do not want it to do any distribution of updates at all -- whatever SolrCore you send the doc to, is

Re: Solr 4.3.0 Cloud Issue indexing pdf documents

2013-06-07 Thread Michael Della Bitta
Hi Mark, This is a total shot in the dark, but does passing -Djava.awt.headless=true when you run the server help at all? More on awt headless mode: http://www.oracle.com/technetwork/articles/javase/headless-136834.html Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1

Re: OR query with null value and non-null value(s)

2013-06-07 Thread Rahul R
Thank you for the Clarification Shawn. On Fri, Jun 7, 2013 at 7:34 PM, Jack Krupansky j...@basetechnology.com wrote: Yes, it SHOULD! And in the LucidWorks Search query parser it does. Why doesn't it in Solr? Ask Yonik to explain that! -- Jack Krupansky -Original Message- From:

Re: LotsOfCores feature

2013-06-07 Thread Aleksey
Aleksey: What would you say is the average core size for your use case - thousands or millions of rows? And how sharded would each of your collections be, if at all? Average core/collection size wouldn't even be thousands, hundreds more like. And the largest would be half a million or so but

Re: LotsOfCores feature

2013-06-07 Thread Jack Krupansky
Thanks. That's what I suspected. Yes, MegaMiniCores. My scenario is purely hypothetical. But it is also relevant for multi-tenant use cases, where the users and schemas are not known in advance and are only online intermittently. Users could fit three rough size categories: very small,

RE: SolrCloud Load Balancer weight

2013-06-07 Thread Vaillancourt, Tim
Cool! Having those values influenced by stats is a neat idea too. I'll get on that soon. Tim -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, June 03, 2013 5:07 PM To: solr-user@lucene.apache.org Subject: Re: SolrCloud Load Balancer weight On Jun 3,

translating a character code to an ordinal?

2013-06-07 Thread geeky2
hello all, environment: solr 3.5, centos problem statement: i have several character codes that i want to translate to ordinal (integer) values (for sorting), while retaining the original code field in the document. i was thinking that i could use a copyField from my code field to my ord field

Re: solr facet query on multiple search term

2013-06-07 Thread Erick Erickson
I'm a little confused here. Faceting is about counting docs that meet your query restrictions. I.e. the q= and fq= clauses. So your original problem statement simply cannot be combined into a single query since your q= clauses are different. You could do something like q=(firstterm OR

Re: translating a character code to an ordinal?

2013-06-07 Thread Jack Krupansky
This won't help you unless you move to Solr 4.0, but here's an update processor script from the book that can take the first character of a string field and add it as an integer value for another field: <updateRequestProcessorChain name="script-add-char-code"> <processor

Re: Filtering on results with more than N words.

2013-06-07 Thread Jack Krupansky
Also from the book, here's an alternative update request processor that uses a JavaScript script to do the counting and field creation: <updateRequestProcessorChain name="script-add-word-count"> <processor class="solr.StatelessScriptUpdateProcessorFactory"> <str name="script">add-word-count.js</str>

Re: translating a character code to an ordinal?

2013-06-07 Thread geeky2
hello jack, thank you for the code ;) what book are you referring to? AFAICT - all of the 4.0 books are future order. we won't be moving to 4.0 (soon enough). so i take it - copyfield will not work, eg - i cannot take a code like ABC and copy it to an int field and then use the regex to turn

Re: translating a character code to an ordinal?

2013-06-07 Thread Jack Krupansky
Correct, you need either an update request processor, a custom field type, or to preprocess your input before you give it to Solr. You can't do analysis on a non-text field. The book is my new Solr reference/guide that I will be self-publishing. We hope to make an Alpha draft available later
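
Of the three options listed above, preprocessing the input is the one that also works on Solr 3.5. A minimal sketch, assuming documents are built as dictionaries before being sent to Solr; the code values, the ordinals, and the fallback value 0 are made up for illustration, while code and ord are the field names mentioned earlier in the thread:

```python
# Hypothetical code-to-ordinal table; the real codes and their sort order
# would come from the application.
CODE_TO_ORD = {"ABC": 1, "DEF": 2, "XYZ": 3}
UNKNOWN_ORD = 0  # assumption: unknown or missing codes get ordinal 0


def add_ordinal(doc, code_field="code", ord_field="ord"):
    """Return a copy of doc with an integer sort field derived from its code.

    The original code field is kept untouched, as the question requires.
    """
    enriched = dict(doc)
    enriched[ord_field] = CODE_TO_ORD.get(doc.get(code_field), UNKNOWN_ORD)
    return enriched


print(add_ordinal({"id": "1", "code": "DEF"}))
```

With the ord field declared as an integer type in the schema, sorting can use it directly while code stays available for display and filtering.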

Re: Lucene/Solr Filesystem tunings

2013-06-07 Thread Tim Vaillancourt
I figured as much for atime, thanks Otis! I haven't run benchmarks just yet, but I'll be sure to share whatever I find. I plan to try ext4 vs xfs. I am also curious what effect disabling journaling (ext2) would have, relying on SolrCloud to manage 'consistency' over many instances vs FS

Re: Two instances of solr - the same datadir?

2013-06-07 Thread Tim Vaillancourt
If it makes you feel better, I also considered this approach when I was in the same situation with a separate indexer and searcher on one Physical linux machine. My main concern was re-using the FS cache between both instances - If I replicated to myself there would be two independent copies of

Re: Two instances of solr - the same datadir?

2013-06-07 Thread Roman Chyla
I have auto commit after 40k RECs/1800secs. I only tested with manual commit, but I don't see why it should work differently. Roman On 7 Jun 2013 20:52, Tim Vaillancourt t...@elementspace.com wrote: If it makes you feel better, I also considered this approach when I was in the same

Re: translating a character code to an ordinal?

2013-06-07 Thread geeky2
thx, please send me a link to the book so i can get/purchase it. thx mark

custom field tutorial

2013-06-07 Thread geeky2
can someone point me to a custom field tutorial. i checked the wiki and this list - but still a little hazy on how i would do this. essentially - when the user issues a query, i want my class to interrogate a string field (containing several codes - example boo, baz, bar) and return a single

Re: LotsOfCores feature

2013-06-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
We set it up like this: + individual solr instances are set up + external mapping/routing to allocate users to instances. This information can be stored in an external data store + all cores are created as transient and loadonstart as false + cores come online on demand + as and when users data get

Re: custom field tutorial

2013-06-07 Thread Walter Underwood
What are you trying to do? This seems really odd. I've been working in search for fifteen years and I've never heard this request. You could always return all the fields to the client and ignore the ones you don't want. wunder On Jun 7, 2013, at 8:24 PM, geeky2 wrote: can someone point me

Re: Multitable import - uniqueKey

2013-06-07 Thread sodoo
Thank you to all the members who replied. The issue is solved.