Re: How to re-index Solr get term frequency within documents

2013-07-02 Thread Tony Mullins
I use Nutch as input datasource for my Solr. So I cannot re-run all the Nutch jobs to generate data again for Solr as it will take very long to generate that much data. I was hoping there would be an easier way inside Solr to just re-index all the existing data. Thanks, Tony On Tue, Jul 2,

Re: Unique key error while indexing pdf files

2013-07-02 Thread archit2112
Can you please suggest a way (with example) of assigning this unique key to a pdf file? -- View this message in context: http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074588.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Unique key error while indexing pdf files

2013-07-02 Thread archit2112
Okay. Can you please suggest a way (with an example) of assigning this unique key to a pdf file? Say, a unique number for each pdf file. How do I achieve this?

Re: Solr indexer and Hadoop

2013-07-02 Thread engy.morsy
Michael, I understand from your post that I can use the current storage without in Hadoop. I already have the storage mounted via NFS. Does your map function read from the mounted storage directly? If possible can you please illustrate more on that. Thanks Engy -- View this message in

Solr - Delta Query Via Full Import

2013-07-02 Thread Mysurf Mail
I am using DIH to fetch rows from db to solr. I have many 1:n relations and I can do it only if I use caching (super fast). Therefore I am adding the following attributes to my inner entity: processor=CachedSqlEntityProcessor cacheKey= cacheLookup= Everything works great and fast. (First the n

Re: Unique key error while indexing pdf files

2013-07-02 Thread Shalin Shekhar Mangar
We can't tell you what the id of your own document should be. Isn't there anything which is unique about your pdf files? How about the file name or the absolute path? On Tue, Jul 2, 2013 at 11:33 AM, archit2112 archit2...@gmail.com wrote: Okay. Can you please suggest a way (with an example) of

Re: Unique key error while indexing pdf files

2013-07-02 Thread archit2112
Yes. The absolute path is unique.

Removal of unique key - Query Elevation Component

2013-07-02 Thread archit2112
I want to index pdf files in solr 4.3.0 using the data import handler. I have done the following: My request handler - requestHandler name=/dataimport class=org.apache.solr.handler.dataimport.DataImportHandler lst name=defaults str name=configdata-config.xml/str /lst

Re: Removal of unique key - Query Elevation Component

2013-07-02 Thread Shalin Shekhar Mangar
My guess is that you have a copyField element which copies the author into an author_s field. On Tue, Jul 2, 2013 at 2:14 PM, archit2112 archit2...@gmail.com wrote: I want to index pdf files in solr 4.3.0 using the data import handler. I have done the following: My request handler -

Re: Solr indexer and Hadoop

2013-07-02 Thread Anatoli Matuskova
If you can upload your data to hdfs you can use this patch to build the solr indexes: https://issues.apache.org/jira/browse/SOLR-1301

Re: Removal of unique key - Query Elevation Component

2013-07-02 Thread archit2112
Thanks! The author_s issue has been resolved. Why are the other fields not getting indexed?

Re: Unique key error while indexing pdf files

2013-07-02 Thread archit2112
Yes. The absolute path is unique. How do I implement it? Can you please explain?

need distance in miles not in kilometers

2013-07-02 Thread irshad siddiqui
Hi, I am using solr 4.2 and my results are coming back properly, but now I want the distance in miles and I am getting the distance in kilometres. Can anyone tell me how to get the distance in miles? example query

Re: OOM killer script woes

2013-07-02 Thread Daniel Collins
On looking at the code in SolrDispatchFilter, is this intentional or not? I think I remember Mark Miller mentioning that in an OOM case, the best course of action is basically to kill the process, there is very little Solr can do once it has run out of memory. Yet it seems that Solr catches the

Aggregate TermFrequency on Result Grouping / Field Collapsing

2013-07-02 Thread Tony Mullins
Hi, Is it possible to perform aggregated termfreq(field,term) on Result Grouping ? I am trying to get total count of term's appearance in a document and then want to aggregate that count by grouping the document on one of my field. Like this

undefined field http:// while searchi query

2013-07-02 Thread aniljayanti
Hi, I am using solr 3.3 version. After indexing I am querying below command. http://localhost:8080/solr/select/?q=*(http://www.google.co.in)* getting below error. SEVERE: org.apache.solr.common.SolrException: *undefined field http://* at

Solr 4.3 Pivot Performance Issue

2013-07-02 Thread solrUserJM
Hi There, I noticed after the upgrade from solr 4.0 to solr 4.3 that we had a degradation of queries that use pivot fields. Has anyone else noticed it too? Thanks

Re: No date.gap on pivoted facets

2013-07-02 Thread Dotan Cohen
On Sun, Jun 30, 2013 at 5:33 PM, Jack Krupansky j...@basetechnology.com wrote: Sorry, but Solr pivot faceting is based solely on field facets, not range (or date) facets. Thank you. I tried adding that information to the SimpleFacetParameters wiki page, but that page seems to be defined as

Spell check in SOLR

2013-07-02 Thread Prathik Puthran
Hi, How can i configure SOLR to provide corrections for misspelled words. If the query string is in dictionary SOLR should not return any suggestions. But if the query string is not in dictionary SOLR should return all possible corrected words in the dictionary which most likely could be the

RE: undefined field http:// while searchi query

2013-07-02 Thread Markus Jelsma
colons need to be escaped cheers -Original message- From:aniljayanti aniljaya...@yahoo.co.in Sent: Tuesday 2nd July 2013 12:35 To: solr-user@lucene.apache.org Subject: undefined field http:// while searchi query Hi, I am using solr 3.3 version. After indexing I am querying

parent Import Query doent run

2013-07-02 Thread Mysurf Mail
I have 1:n relation between my main entity(PackageVersion) and its tag in my DB. I add a new tag with this date to the db at the timestamp and I run delta import command. the select retrieves the line but i dont see any other sql. Here are my data-config.xml configurations: entity

Re: undefined field http:// while searchi query

2013-07-02 Thread Daniel Collins
Presuming that uses the standard lucene query parser syntax then you have asked to query for the field called http, searching for the value // www.google.co.in See http://wiki.apache.org/solr/SolrQuerySyntax for more details, but you probably want to escape the : at least, http\://www.google.co.in
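The escaping advice above can be sketched on the client side. This is a minimal sketch, assuming the query goes through the standard Lucene query parser; the helper name escape_query_value and the exact character list are assumptions for illustration, not part of Solr. Escaping the forward slash is not strictly needed on Solr 3.3, but a backslash before any character is harmless there.

```python
import re

# Characters treated as query syntax by the Lucene query parser.
# The list is an assumption based on the Lucene query syntax docs.
_SPECIAL = re.compile(r'([\\+\-!():^\[\]"{}~*?|&;/])')


def escape_query_value(value: str) -> str:
    """Put a backslash in front of every query-syntax character."""
    return _SPECIAL.sub(r'\\\1', value)


# The URL from the thread, made safe to use as a term value.
escaped = escape_query_value("http://www.google.co.in")
print(escaped)
```

The escaped value still has to be URL-encoded when it is placed in the q parameter of an HTTP request.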

Re: Stemming query in Solr

2013-07-02 Thread Erick Erickson
Somehow we're mis-communicating here. Forget expansion, it's all about base forms. G. bq: What I cannot figure out is how is this going to help me in instructing Solr to execute the query for the different grammatical variations of the input search term stem You don't. You search the stemmed

Re: documentCache not used in 4.3.1?

2013-07-02 Thread Erick Erickson
This takes some significant custom code, but... One strategy is to keep your commits relatively lengthy (depends on the ingest rate) and keep a side car index either a small core or a RAMDirectory. Then at search time you somehow combine the two results. The somehow is a bit tricky since the

Re: Converting nested data model to solr schema

2013-07-02 Thread adfel70
As you see it, does SOLR-3076 fix my problem? Is the SOLR-3076 fix getting into solr 4.4? Mikhail Khludnev wrote On Mon, Jul 1, 2013 at 5:56 PM, adfel70 <adfel70@> wrote: This requires me to override the solr document distribution mechanism. I fear that with this solution I may

Re: Solr 4.3 Pivot Performance Issue

2013-07-02 Thread Jack Krupansky
What is the nature of your degradation? -- Jack Krupansky -Original Message- From: solrUserJM Sent: Tuesday, July 02, 2013 4:22 AM To: solr-user@lucene.apache.org Subject: Solr 4.3 Pivot Performance Issue Hi There, I notice with the upgrade from solr 4.0 to solr 4.3 that we had a

Re: need distance in miles not in kilometers

2013-07-02 Thread Jack Krupansky
Simply multiply by the number of miles per kilometer, 0.621371: fl=_dist_:mul(geodist(),0.621371) -- Jack Krupansky -Original Message- From: irshad siddiqui Sent: Tuesday, July 02, 2013 5:19 AM To: solr-user@lucene.apache.org Subject: need distance in miles not in kilometers Hi, I
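The conversion above can be sketched as follows. This is a rough sketch, not a tested Solr setup: the helper names are made up for illustration, and only the fl expression and the 0.621371 factor come from the reply. For a range filter such as frange, one option (an assumption, not something confirmed in the thread) is to convert the miles threshold to kilometres once on the client instead of converting every document's distance.

```python
KM_TO_MILES = 0.621371  # miles per kilometre, as quoted in the reply


def km_to_miles(km: float) -> float:
    return km * KM_TO_MILES


def miles_to_km(miles: float) -> float:
    return miles / KM_TO_MILES


def distance_fl(alias: str = "_dist_") -> str:
    """Build the fl pseudo-field that returns geodist() in miles."""
    return f"{alias}:mul(geodist(),{KM_TO_MILES})"


def frange_filter_miles(max_miles: float) -> str:
    """Build an frange filter whose upper bound is given in miles.

    geodist() still yields kilometres, so the bound is converted instead.
    """
    return f"{{!frange l=0 u={miles_to_km(max_miles):.3f}}}geodist()"


print(distance_fl())
print(frange_filter_miles(10))
```

Converting the bound keeps the function query unchanged, so the same geodist() expression can be reused for sorting and filtering.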

Re: need distance in miles not in kilometers

2013-07-02 Thread irshad siddiqui
Jack, thanks for your response. In the case of frange we do not want to multiply separately for the conversion, so is there any way to convert it into miles in that case? My query:

Re: documentCache not used in 4.3.1?

2013-07-02 Thread Daniel Collins
Cheers, it's certainly something we might end up exploring. On 2 July 2013 12:41, Erick Erickson erickerick...@gmail.com wrote: This takes some significant custom code, but... One strategy is to keep your commits relatively lengthy (depends on the ingest rate) and keep a side car index

Re: Converting nested data model to solr schema

2013-07-02 Thread Jack Krupansky
It sounds like 4.4 will have an RC next week, so the prospects for block join in 4.4 are kind of dim. I mean, such a significant feature should have more than a few days to bake before getting released. But... who knows what Yonik has planned! -- Jack Krupansky -Original Message-

Re: Converting nested data model to solr schema

2013-07-02 Thread adfel70
I'm not familiar with block join in lucene. I've read a bit, and I just want to make sure - do you think that when this ticket is released, it will solve the current problem of solr cloud joins? Also, can you elaborate a bit about your solution? Jack Krupansky-2 wrote It sounds like 4.4 will

Re: Spell check in SOLR

2013-07-02 Thread Shalin Shekhar Mangar
See http://wiki.apache.org/solr/SpellCheckComponent On Tue, Jul 2, 2013 at 4:14 PM, Prathik Puthran prathik.puthra...@gmail.com wrote: Hi, How can i configure SOLR to provide corrections for misspelled words. If the query string is in dictionary SOLR should not return any suggestions. But if

DIH: HTMLStripTransformer in sub-entities?

2013-07-02 Thread Andy Pickler
Solr 4.1.0 We've been using the DIH to pull data in from a MySQL database for quite some time now. We're now wanting to strip all the HTML content out of many fields using the HTMLStripTransformer ( http://wiki.apache.org/solr/DataImportHandler#HTMLStripTransformer). Unfortunately, while it

Solr - working with delta import and cache

2013-07-02 Thread Mysurf Mail
I have two entities in 1:n relation - PackageVersion and Tag. I have configured DIH to use CachedSqlEntityProcessor and everything works as planned. First, Tag entity is selected using the query attribute. Then the main entity. Ultra Fast. Now I am adding the delta import. Everything runs and

Solr cloud date based paritioning

2013-07-02 Thread kowish.adamosh
Hi guys! I have simple use case to implement but I have problem with date based partitioning... Here are some rules: 1. At the beginning I have to create huge index (10GB) based on one db table. 2. Every day I have to update this index. 3. 99,999% are queries based on date field (*data from last

Re: Solr - working with delta import and cache

2013-07-02 Thread Mysurf Mail
BTW: Just found out that a delta import is only supported by the SqlEntityProcessor . Does it matter that I defined processor=CachedSqlEntityProcessor? On Tue, Jul 2, 2013 at 5:58 PM, Mysurf Mail stammail...@gmail.com wrote: I have two entities in 1:n relation - PackageVersion and Tag. I have

How to disable debug in Solrj

2013-07-02 Thread Jean-Pierre Lauris
Hi, I'm running the jetty start.jar and I'm indexing documents with Solrj's HttpSolrServer object: SolrServer server = new HttpSolrServer("http://HOST:8983/solr/"); server.add( docs ); server.commit(); This leads to TONS of debug information (i.e. logs at level DEBUG), on both server and client

Re: Solr cloud date based paritioning

2013-07-02 Thread Gora Mohanty
On 2 July 2013 20:05, kowish.adamosh kowish.adam...@gmail.com wrote: Hi guys! I have simple use case to implement but I have problem with date based partitioning... Here are some rules: 1. At the beginning I have to create huge index (10GB) based on one db table. 2. Every day I have to

Re: Using per-segment FieldCache or DocValues in custom component?

2013-07-02 Thread Robert Muir
Where do you get the docid from? Usually its best to just look at the whole algorithm, e.g. docids come from per-segment readers by default anyway so ideally you want to access any per-document things from that same segmentreader. As far as supporting docvalues, FieldCache API passes thru to

Re: DIH: HTMLStripTransformer in sub-entities?

2013-07-02 Thread Gora Mohanty
On 2 July 2013 20:29, Andy Pickler andy.pick...@gmail.com wrote: Solr 4.1.0 We've been using the DIH to pull data in from a MySQL database for quite some time now. We're now wanting to strip all the HTML content out of many fields using the HTMLStripTransformer (

Re: DIH: HTMLStripTransformer in sub-entities?

2013-07-02 Thread Andy Pickler
Thanks for the quick reply. Unfortunately, I don't believe my company would want me sharing our exact production schema in a public forum, although I realize it makes it harder to diagnose the problem. The sub-entity is a multi-valued field that indeed does have a relationship to the outer

Re: Solr indexer and Hadoop

2013-07-02 Thread Michael Della Bitta
Yes, I've read directly from NFS. Consider the case where your mapper takes as input a list of the file paths to operate on. Your mapper would load each file one by one by using standard java.io.* calls, build a SolrInputDocument out of each one, and submit it to a SolrServer implementation

Re: Newbie SolR - Need advice

2013-07-02 Thread Jack Krupansky
Start with the Solr Tutorial. http://lucene.apache.org/solr/tutorial.html -- Jack Krupansky -Original Message- From: fabio1605 Sent: Tuesday, July 02, 2013 11:16 AM To: solr-user@lucene.apache.org Subject: Newbie SolR - Need advice Hi we have a MSSQL Server which is just getting

Re: OOM killer script woes

2013-07-02 Thread Mark Miller
Please file a JIRA issue so that we can address this. - Mark On Jul 2, 2013, at 6:20 AM, Daniel Collins danwcoll...@gmail.com wrote: On looking at the code in SolrDispatchFilter, is this intentional or not? I think I remember Mark Miller mentioning that in an OOM case, the best course of

Re: Unique key error while indexing pdf files

2013-07-02 Thread Shalin Shekhar Mangar
See http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor The implicit fields generated by the FileListEntityProcessor are fileDir, file, fileAbsolutePath, fileSize, fileLastModified and these are available for use within the entity On Tue, Jul 2, 2013 at 2:47 PM, archit2112

RE: Newbie SolR - Need advice

2013-07-02 Thread David Quarterman
Hi Fabio, Like Jack says, try the tutorial. But to answer your question, SOLR isn't a bolt on to SQLServer or any other DB. It's a fantastically fast indexing/searching tool. You'll need to use the DataImportHandler (see the tutorial) to import your data from the DB into the indices that SOLR

RE: Newbie SolR - Need advice

2013-07-02 Thread fabio1605
Thanks guys So SolR is actually a database replacement for mssql...  Am I right  We have a lot of perl scripts that contains lots of sql insert queries. Etc How do we query the SolR database from scripts  I know I have a lot to learn still so excuse my ignorance.  Also...  What

Re: Solr cloud date based paritioning

2013-07-02 Thread Otis Gospodnetic
Hi, There is nothing automatic that I know of that will create shards (or maybe you mean SolrCloud Collections?) every month. You can do that in your application, though, just create the Collection via the API. You can make use of aliases to have something like last2months alias point to your
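The application-side approach described above could look roughly like this. It is only a sketch under assumptions: the collection naming scheme (docs_YYYY_MM), the base URL, and the shard count are made up for illustration, and the code only builds Collections API request URLs (CREATE and CREATEALIAS) without sending them.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr"  # hypothetical Solr base URL


def month_collection(d: date) -> str:
    """Name of the (hypothetical) per-month collection for a date."""
    return f"docs_{d.year:04d}_{d.month:02d}"


def previous_month(d: date) -> date:
    """First day of the month before the one containing d."""
    first = d.replace(day=1)
    return (first - timedelta(days=1)).replace(day=1)


def create_collection_url(name: str, num_shards: int) -> str:
    params = {"action": "CREATE", "name": name, "numShards": num_shards}
    return f"{BASE}/admin/collections?{urlencode(params)}"


def alias_url(alias: str, collections: list) -> str:
    params = {"action": "CREATEALIAS", "name": alias,
              "collections": ",".join(collections)}
    return f"{BASE}/admin/collections?{urlencode(params)}"


today = date(2013, 7, 2)
current = month_collection(today)
previous = month_collection(previous_month(today))
print(create_collection_url(current, 2))
print(alias_url("last2months", [current, previous]))
```

A monthly job could issue the first request when a new month starts and then re-point the alias, so queries keep using the same alias name.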

Re: How to re-index Solr get term frequency within documents

2013-07-02 Thread Otis Gospodnetic
Hi Tony, There is, you can do it with that SolrEntityProcessor I pointed out, if you have all your fields stored in Solr. Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Jul 2, 2013 at 2:00 AM, Tony Mullins

Re: Newbie SolR - Need advice

2013-07-02 Thread Sandeep Mestry
Hi Fabio, No, Solr isn't the database replacement for MS SQL. Solr is built on top of Lucene which is a search engine library for text searches. Solr in itself is not a replacement for any database as it does not support any relational db features, however as Jack and David mentioned its fully

set-based and other less common approaches to search

2013-07-02 Thread gilawem
Let's say I wanted to ask solr to find me any document that contains at least 100 out of some 300 search terms I give it. Can Solr do this out of the box? If not, what kind of customization would it require? Now let's say I want to further have the option to request that those terms a) must

Re: set-based and other less common approaches to search

2013-07-02 Thread Otis Gospodnetic
Hi, Solr can do all of these. There are phrase queries, queries where you specify a field, the mm param for min should match, etc. Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Jul 2, 2013 at 12:36 PM, gilawem

Tomcat Solr Server startup fails with FileNotFoundException

2013-07-02 Thread Murthy Perla
Hi All, I am a newbie to solr. I accidentally deleted index files (manually, using the rm -rf command) from the solr index folder on the server. Since then, whenever I start my server it fails to start with an FNF exception. How can this be fixed quickly? Appreciate if anyone can suggest a quick fix

Re: Newbie SolR - Need advice

2013-07-02 Thread fabio1605
Hi Ok I'm even more confused now...  Sorry for even more stupid questions.  So if it's not a database replacement  Where do we keep the database then.  We have a website that is a documentation website that store documents.  It has over 130 million records in a table and 50

Re: Newbie SolR - Need advice

2013-07-02 Thread Jack Krupansky
Consider DataStax Enterprise - it combines Cassandra for NoSql data storage with Solr for indexing - fully integrated. http://www.datastax.com/ -- Jack Krupansky -Original Message- From: fabio1605 Sent: Tuesday, July 02, 2013 12:44 PM To: solr-user@lucene.apache.org Subject: Re:

Re: Newbie SolR - Need advice

2013-07-02 Thread Shawn Heisey
On 7/2/2013 10:09 AM, fabio1605 wrote: Thanks guys So SolR is actually a database replacement for mssql... Am I right We have a lot of perl scripts that contains lots of sql insert queries. Etc How do we query the SolR database from scripts I know I have a lot to

Re: Newbie SolR - Need advice

2013-07-02 Thread Walter Underwood
Solr is not a database and it does not handle SQL queries. --wunder On Jul 2, 2013, at 9:09 AM, fabio1605 wrote: Thanks guys So SolR is actually a database replacement for mssql... Am I right We have a lot of perl scripts that contains lots of sql insert queries. Etc

Re: Newbie SolR - Need advice

2013-07-02 Thread fabio1605
Arrfh I see...  So SolR is the search engine for a datastore  Is that what mongo is.. A datastore bit.  Sent from Samsung Mobile Original message From: Jack Krupansky-2 [via Lucene] ml-node+s472066n4074809...@n3.nabble.com Date: 02/07/2013 17:51 (GMT+00:00) To:

Re: Tomcat Solr Server startup fails with FileNotFoundException

2013-07-02 Thread Shawn Heisey
On 7/2/2013 9:39 AM, Murthy Perla wrote: I am newbie to solr. I've accidentally deleted indexed files(manually using rm -rf command) on server from solr index folder. Then on when ever I start my server its failing to start with FNF exception. How can this be fixed quickly? I believe

RE: Newbie SolR - Need advice

2013-07-02 Thread David Quarterman
Don’t worry Fabio - nobody knows everything (apart from Hossman). Following on from Sandeep, to use SOLR, you extract the data from your MSSQL DB using the DataImportHandler and you can then query it, facet it, pivot it to your heart's content. And fast! You can use almost anything to build

Re: Solr large boolean filter

2013-07-02 Thread Roman Chyla
Hello @, This thread 'kicked' me into finishing some long-past task of sending/receiving large boolean (bitset) filter. We have been using bitsets with solr before, but now I sat down and wrote it as a qparser. The use cases, as you have discussed are: - necessity to send long list of ids as

Re: Concurrent Modification Exception

2013-07-02 Thread adityab
Anyone, any suggestions or pointers for this issue?

Re: set-based and other less common approaches to search

2013-07-02 Thread gilawem
Thanks. So following up on a) below, could I set up and query Solr, without any customization of code, to match 10 of my given 20 terms, but only if it finds those 10 terms in an xls document under a column that is named MyID or My ID or My I.D.? If so, what would that query look like? On Jul

Re: Solr large boolean filter

2013-07-02 Thread Roman Chyla
Wrong link to the parser, should be: https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/src/java/org/apache/solr/search/BitSetQParserPlugin.java On Tue, Jul 2, 2013 at 1:25 PM, Roman Chyla roman.ch...@gmail.com wrote: Hello @, This thread 'kicked' me into finishing som

How to show just the parent domains from results in Solr

2013-07-02 Thread A Geek
hi All, I've indexed documents in my Solr 4.0 index, with fields like URL, page_content etc. Now when I run a search query against page_content I get a lot of urls. Say I have 15 URL domains in total, and all the pages under these 15 domains are indexed in SOLR. Is there a way in

Re: Solr cloud date based paritioning

2013-07-02 Thread kowish.adamosh
Thanks! I have very limited response time (max 100ms) therefore sharding is a must. Data also tends to grow up to tens of gigs. Is there any way to create a new logical shard at runtime? I want to logically partition my data by date. I'm still wondering how is implemented example from

Re: Solr large boolean filter

2013-07-02 Thread Mikhail Khludnev
Hello Roman, Have you considered passing the long id sequence as the request body and accessing it internally in solr as a content stream? That makes base64 compression unnecessary. AFAIK url length is limited somehow, anyway. On Tue, Jul 2, 2013 at 9:32 PM, Roman Chyla roman.ch...@gmail.com wrote: Wrong link to

Re: set-based and other less common approaches to search

2013-07-02 Thread Mikhail Khludnev
try to hit dismax query parser specifying mm and qf parameters. On Tue, Jul 2, 2013 at 9:31 PM, gilawem mewa...@gmail.com wrote: Thanks. So following up on a) below, could I set up and query Solr, without any customization of code, to match 10 of my given 20 terms, but only if it finds those
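The dismax suggestion above can be sketched as a set of request parameters. This is a minimal sketch, assuming a field named my_id exists in the schema (a hypothetical name) and that the terms are plain words; defType, q, qf and mm are standard dismax parameters, everything else is illustrative.

```python
from urllib.parse import urlencode


def min_match_params(terms, min_match, fields):
    """Request parameters asking that at least min_match terms match in fields."""
    return {
        "defType": "dismax",
        "q": " ".join(terms),      # all candidate terms, space separated
        "qf": " ".join(fields),    # fields the terms are searched in
        "mm": str(min_match),      # minimum number of terms that must match
    }


terms = [f"term{i}" for i in range(1, 21)]       # 20 placeholder terms
params = min_match_params(terms, 10, ["my_id"])  # "my_id" is hypothetical
print("/select?" + urlencode(params))
```

With 20 terms and mm=10, a document has to contain at least 10 of the terms in the listed field to be returned.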

Re: Converting nested data model to solr schema

2013-07-02 Thread Mikhail Khludnev
During indexing the whole block (doc and its attachment) goes into a particular shard; it can then be queried on every shard and the results are merged. Btw, do you feel any problem with your current approach - query time joins and out-of-the-box shard routing? On Tue, Jul 2, 2013 at 5:19 PM, adfel70

copyField and storage requirements

2013-07-02 Thread Ali, Saqib
Newbie question: We have the following fields defined in the schema: <field name="content" type="text_general" indexed="true" stored="false"/> <field name="teaser" type="text_general" indexed="false" stored="true"/> <copyField source="content" dest="teaser" maxChars="80"/> The content field is about 500KB of data. My

Request to Edit Solr Wiki

2013-07-02 Thread Vivek Shivaprabhu
Hi, I'd like to contribute to some of the pages in the Solr Wiki at wiki.apache.org/solr My username is VivekShivaprabhu (alias: vivekrs) Please do the needful. Thanks in advance! -Vivek R S

Re: Request to Edit Solr Wiki

2013-07-02 Thread Erick Erickson
Done, added VivekShivaprabhu to the Solr contributor's group. Let us know if you need the alias instead. And thanks for helping with the Wiki! Erick On Tue, Jul 2, 2013 at 1:42 PM, Vivek Shivaprabhu vivekrs@gmail.com wrote: Hi I'd like to contribute to some of the page in the Solr

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Roman Chyla
as i discovered, it is not good to use 'native' locktype in this scenario, actually there is a note in the solrconfig.xml which says the same when a core is reloaded and solr tries to grab lock, it will fail - even if the instance is configured to be read-only, so i am using 'single' lock for the

Re: copyField and storage requirements

2013-07-02 Thread Shawn Heisey
On 7/2/2013 12:22 PM, Ali, Saqib wrote: Newbie question: We have the following fields defined in the schema: <field name="content" type="text_general" indexed="true" stored="false"/> <field name="teaser" type="text_general" indexed="false" stored="true"/> <copyField source="content" dest="teaser" maxChars="80"/>

Re: Solr large boolean filter

2013-07-02 Thread Roman Chyla
Hello Mikhail, Yes, GET is limited, but POST is not - so I just wanted that it works in both the same way. But I am not sure if I am understanding your question completely. Could you elaborate on the parameters/body part? Is there no need for encoding of binary data inside the body? Or do you

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Peter Sturge
Hmmm, single lock sounds dangerous. It probably works ok because you've been [un]lucky. For example, even with a RO instance, you still need to do a commit in order to reload caches/changes from the other instance. What happens if this commit gets called in the middle of the other instance's

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Roman Chyla
Interesting, we are running 4.0 - and solr will refuse to start (or reload) the core. But from looking at the code I am not seeing it is doing any writing - but I should dig more... Are you sure it needs to do writing? Because I am not calling commits, in fact I have deactivated *all*

Filter cache pollution during sharded edismax queries

2013-07-02 Thread Ken Krugler
Hi all, After upgrading from Solr 3.5 to 4.2.1, I noticed our filterCache hit ratio had dropped significantly. Previously it was at 95+%, but now it's 50%. I enabled recording 100 entries for debugging, and in looking at them it seems that edismax (and faceting) is creating entries for me.

Re: Replicating files containing external file fields

2013-07-02 Thread Arun Rangarajan
Jack and Erick, Thanks for your replies. I am able to replicate ext file fields by specifying the relative paths for each individual file. confFiles in solrconfig.xml is really long now with lot of ../ and I got 5 ext file field files. Would be really nice if wild-cards were supported here :-).

Re: Solr large boolean filter

2013-07-02 Thread Mikhail Khludnev
Roman, It's covered in http://wiki.apache.org/solr/ContentStream | For POST requests where the content-type is not application/x-www-form-urlencoded, the raw POST body is passed as a stream. So, there is no need for encoding of binary data inside the body. Regarding encoding, I have a

Re: DIH: HTMLStripTransformer in sub-entities?

2013-07-02 Thread Gora Mohanty
On 2 July 2013 20:55, Andy Pickler andy.pick...@gmail.com wrote: Thanks for the quick reply. Unfortunately, I don't believe my company would want me sharing our exact production schema in a public forum, although I realize it makes it harder to diagnose the problem. The sub-entity is a

Re: Converting nested data model to solr schema

2013-07-02 Thread adfel70
My current solution is overriding the out-of-the-box shard routing, and forcing each document and its attachment to go into a specific shard. But this is so I can support the query time joins (because join are only performed between documents in the same shard). I'm a bit concerned by this

Re: Solr cloud date based paritioning

2013-07-02 Thread Gora Mohanty
On 2 July 2013 22:35, kowish.adamosh kowish.adam...@gmail.com wrote: Thanks! I have very limited response time (max 100ms) therefore sharding is a must. Really? Sharding is a must without any measurements to validate that assertion? I am not sure what advice to give you if you seem determined

Access to Solr Wiki

2013-07-02 Thread Gora Mohanty
Hi, May I please be added to the list of editors to the Solr Wiki as I see that some earlier changes seem to have gone missing. My user name is GoraMohanty Thanks. Regards, Gora

How to query Solr for empty field or specific value

2013-07-02 Thread Van Tassell, Kristian
Hello, I'm using Solr 4.2 and am trying to get a specific value (blue) or null field (no color) returned by my filter query. My results should yield 3 documents (If I execute the two separate filters in different queries, I get 2 hits for one query and 1 for the other). I've tried this (blue

RE: Newbie SolR - Need advice

2013-07-02 Thread fabio1605
So, you keep your mssql database, you just don't use it for searches - that'll relieve some of the load. Searches then all go through SOLR and its Lucene indexes. If your various tables need SQL joins, you specify those in the DataImportHandler (DIH) config. That way, when SOLR indexes everything,

Re: copyField and storage requirements

2013-07-02 Thread Ali, Saqib
Thanks Shawn. Here is the text_general type definition. We would like to bring down the storage requirement down to a minimum for those 500KB content documents. We just need basic full-text search. Thanks!!! :) fieldType name=text_general class=solr.TextField positionIncrementGap=100

Re: How to query Solr for empty field or specific value

2013-07-02 Thread Jack Krupansky
Better to define color.not_null as a boolean field and always initialize as either true or false. But, even without that you need to write a pure negative query or clause as (*:* -term) So: select?q=*:*&fq=((*:* -color:[* TO *]) OR color:blue) and select?q=*:*&fq=((*:*
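The filter from the reply above can be built programmatically. This is a minimal sketch; the helper name value_or_missing is made up for illustration, and the value is assumed to be a simple term that needs no escaping.

```python
from urllib.parse import urlencode


def value_or_missing(field: str, value: str) -> str:
    """Filter matching docs where field is absent or equals value.

    A purely negative clause matches nothing on its own, so it is
    anchored with *:* as described in the reply.
    """
    return f"((*:* -{field}:[* TO *]) OR {field}:{value})"


fq = value_or_missing("color", "blue")
print(fq)
print("/select?" + urlencode({"q": "*:*", "fq": fq}))
```

urlencode takes care of the spaces, brackets and colons when the filter is placed in a request URL.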

Re: Solr cloud date based paritioning

2013-07-02 Thread kowish.adamosh
Sure, I'll measure results and come back if they are unsatisfactory. Thanks very much for the advice. Out of curiosity: is there any way to partition shards (logical and physical) by a specified value of a specified field? Kowish

Re: How to show just the parent domains from results in Solr

2013-07-02 Thread Jack Krupansky
Re-index your data with a separate field for domain name, then either manually populate it or use an update processor to extract the domain name and store it in the desired field. You can then group by that field. The URL Classify update processor can do the trick. Or maybe a custom script
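The "manually populate it" option above can be sketched on the indexing client. This is a minimal sketch, not the URL Classify update processor: the field name domain is hypothetical, and the host is taken as-is, without reducing subdomains to a registered domain. group and group.field are standard result grouping parameters.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Host part of a URL, lowercased; empty string if there is none."""
    host = urlparse(url).hostname
    return host if host else ""


def with_domain(doc: dict) -> dict:
    """Copy of a document dict with a (hypothetical) 'domain' field added."""
    enriched = dict(doc)
    enriched["domain"] = domain_of(doc.get("url", ""))
    return enriched


doc = with_domain({"url": "http://www.Example.com/docs/page1.html"})
print(doc["domain"])

# Query-side parameters to return one group per domain.
group_params = {"q": "page_content:solr", "group": "true", "group.field": "domain"}
```

After re-indexing with the extra field, grouping on it returns one entry per domain instead of one per page.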

RE: How to query Solr for empty field or specific value

2013-07-02 Thread Van Tassell, Kristian
Thank you! -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Tuesday, July 02, 2013 3:05 PM To: solr-user@lucene.apache.org Subject: Re: How to query Solr for empty field or specific value Better to define color.not_null as a boolean field and always

What are the options for obtaining IDF at interactive speeds?

2013-07-02 Thread Kathryn Mazaitis
Hi, I'm using SOLRJ to run a query, with the goal of obtaining: (1) the retrieved documents, (2) the TF of each term in each document, (3) the IDF of each term in the set of retrieved documents (TF/IDF would be fine too) ...all at interactive speeds, or 10s per query. This is a demo, so if all

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Peter Sturge
The RO instance commit isn't (or shouldn't be) doing any real writing, just an empty commit to force new searchers, autowarm/refresh caches etc. Admittedly, we do all this on 3.6, so 4.0 could have different behaviour in this area. As long as you don't have autocommit in solrconfig.xml, there

Partial Matching in both query and field

2013-07-02 Thread James Bathgate
Given a string of 123456 and a search query 923459, what should the schema look like to consider this a match because at least 4 consecutive characters in the query match 4 consecutive characters in the data? I'm trying an NGramFilterFactory on the index and NGramTokenizerFactory on the query in
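The matching rule in the question can be checked outside Solr with a few lines. This is only an illustration of the n-gram idea, not how Solr's n-gram filters are implemented; the function names are made up.

```python
def ngrams(text: str, n: int) -> set:
    """All substrings of length n (empty set if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def shares_ngram(a: str, b: str, n: int = 4) -> bool:
    """True if a and b have at least one run of n characters in common."""
    return bool(ngrams(a, n) & ngrams(b, n))


print(sorted(ngrams("123456", 4)))
print(sorted(ngrams("923459", 4)))
print(shares_ngram("123456", "923459"))
```

Both strings contain the 4-gram 2345, which is why they should count as a match when any single shared gram is enough.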

Re: Newbie SolR - Need advice

2013-07-02 Thread Sandeep Mestry
Hi Fabio, Yes, you're on the right track. I'd like to now direct you to the first reply from Jack to go through the solr tutorial. Even with Solr, it will take some time to learn various bits and pieces about designing fields, their field types, server configuration, etc. and then tune the results to match

Re: Partial Matching in both query and field

2013-07-02 Thread Jack Krupansky
You will need to set q.op to OR, and... use a field type that has the autoGeneratePhraseQueries attribute set to false. -- Jack Krupansky -Original Message- From: James Bathgate Sent: Tuesday, July 02, 2013 5:10 PM To: solr-user@lucene.apache.org Subject: Partial Matching in both

Re: Partial Matching in both query and field

2013-07-02 Thread James Bathgate
Jack, I've already tried that, here's my query: str name=debugQueryon/str str name=indenton/str str name=start0/str str name=q0_extrafield1_n:20454/str str name=q.opOR/str str name=rows10/str str name=version2.2/str Here's the parsed query: str name=parsedquery_toString0_extrafield1_n:2o45

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Michael Della Bitta
Wouldn't it be better to do a RELOAD? http://wiki.apache.org/solr/CoreAdmin#RELOAD Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions

Re: Partial Matching in both query and field

2013-07-02 Thread Jack Krupansky
Ahhh... you put autoGeneratePhraseQueries=false on the field - but it needs to be on the field type. You can see from the parsed query that it generated the phrase. -- Jack Krupansky -Original Message- From: James Bathgate Sent: Tuesday, July 02, 2013 5:35 PM To:

Re: Access to Solr Wiki

2013-07-02 Thread Steve Rowe
I've added GoraMohanty to the Solr wiki's ContributorsGroup page. - Steve On Jul 2, 2013, at 3:25 PM, Gora Mohanty g...@mimirtech.com wrote: Hi, May I please be added to the list of editors to the Solr Wiki as I see that some earlier changes seem to have gone missing. My user name is
