Re: solr reporting tool adapter

2009-10-07 Thread Shalin Shekhar Mangar
On Tue, Oct 6, 2009 at 1:09 PM, Rakhi Khatwani rkhatw...@gmail.com wrote: Hi, I wanted to query Solr and send the output to some reporting tool. Has anyone done something like that? Moreover, which reporting filter is good? Any suggestions? Can you be more specific on what you want to

Re: solr optimize - no space left on device

2009-10-07 Thread Shalin Shekhar Mangar
Not sure but a quick search turned up: http://www.walkernews.net/2007/07/13/df-and-du-command-show-different-used-disk-space/ Using up to 2x the index size can happen. Also check if there is a snapshooter script running through cron which is making hard links to files while a merge is in progress.

datadir configuration

2009-10-07 Thread clico
Hello. As I try to deploy my app on a Tomcat server, I'd like to customize the dataDir variable outside the solrconfig.xml file. Is there a way to customize it in a context file? Thanks -- View this message in context: http://www.nabble.com/datadir-configuration-tp25782469p25782469.html Sent from the

Re: datadir configuration

2009-10-07 Thread Gasol Wu
Hi, add the JAVA_OPTS variable in TOMCAT_HOME/bin/catalina.sh like below: JAVA_OPTS="$JAVA_OPTS -Dsolr.home=/opt/solr -Dsolr.foo.data.dir=/opt/solr/data" The solr.data.dir property must map to dataDir in solrconfig.xml. Here is an example (solrconfig.xml):

Doing SpellCheck in distributed search

2009-10-07 Thread balaji.a
Hi All, I am trying to get spell check suggestions in my distributed search query using shards. I have 2 cores configured core0 and core1 both having spell check component configured. On requesting search result using the following query I don't get the spelling suggestions.

Re: ISOLatin1AccentFilter before or after Snowball?

2009-10-07 Thread Shalin Shekhar Mangar
On Tue, Oct 6, 2009 at 4:33 PM, Chantal Ackermann chantal.ackerm...@btelligent.de wrote: Hi all, from reading through previous posts on that subject, it seems like the accent filter has to come before the snowball filter. I'd just like to make sure this is so. If it is the case, I'm

Re: Questions about synonyms and highlighting

2009-10-07 Thread Shalin Shekhar Mangar
I'm not an expert on hit highlighting but please find some answers inline: On Wed, Sep 30, 2009 at 9:03 PM, Nourredine K. nourredin...@yahoo.com wrote: Hi, Can you please give me some answers for those questions : 1 - How can I get synonyms found for a keyword ? I mean i search foo and i

Re: solr reporting tool adapter

2009-10-07 Thread Rakhi Khatwani
we basically wanna generate PDF reports which contain, tag clouds, bar charts, pie charts etc. Regards, Raakhi On Wed, Oct 7, 2009 at 1:28 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Tue, Oct 6, 2009 at 1:09 PM, Rakhi Khatwani rkhatw...@gmail.com wrote: Hi, i wanted

Re: datadir configuration

2009-10-07 Thread clico
What do I put in <dataDir>${solr.foo.data.dir:/default/path/to/datadir}</dataDir> ? What is /default/path/to/datadir? Gasol Wu wrote: Hi, add JAVA_OPTS variable in TOMCAT_HOME/bin/catalina.sh like below, JAVA_OPTS=$JAVA_OPTS -Dsolr.home=/opt/solr -Dsolr.foo.data.dir=/opt/solr/data

Re: Solr Timeouts

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 2:19 AM, Giovanni Fernandez-Kincade gfernandez-kinc...@capitaliq.com wrote: What does the maxCommitsToKeep(from SolrDeletionPolicy in SolrConfig.xml) parameter actually do? Increasing this value seems to have helped a little, but I'm wary of cranking it without having

Re : Questions about synonyms and highlighting

2009-10-07 Thread Nourredine K.
I'm not an expert on hit highlighting but please find some answers inline: Thanks Shalin for your answers. They help a lot. I am posting questions #3 and #4 again for the others :) 3 - Is it possible, and if so how, to configure Solr to enable or disable highlighting for tokens with diacritics?

Re: Indexing and searching of sharded/ partitioned databases and tables

2009-10-07 Thread Shalin Shekhar Mangar
Comments inline: On Wed, Oct 7, 2009 at 2:01 PM, Jayant Kumar Gandhi jaya...@gmail.com wrote: Let's say I have 3 mysql databases each with 3 tables. Db1 : Tbl1, Tbl2, Tbl3 Db2 : Tbl1, Tbl2, Tbl3 Db3 : Tbl1, Tbl2, Tbl3 All databases have the same number of tables with same table names as

Re: Doing SpellCheck in distributed search

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 2:14 PM, balaji.a reachbalaj...@gmail.com wrote: Hi All, I am trying to get spell check suggestions in my distributed search query using shards. SpellCheckComponent does not support distributed search yet. There is an issue open with a patch. If you decide to use,

Re: solr reporting tool adapter

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 2:51 PM, Rakhi Khatwani rkhatw...@gmail.com wrote: we basically wanna generate PDF reports which contain, tag clouds, bar charts, pie charts etc. Faceting on a field will give you top terms and frequency information which can be used to create tag clouds. What do you
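As a rough illustration of the tag-cloud suggestion above, here is a minimal Python sketch. It assumes the facet counts arrive as a flat list alternating term and count (the shape Solr's JSON response writer uses for facet fields by default); the function name `tag_cloud_sizes`, the pixel range, and the sample data are hypothetical and not from the thread.

```python
def tag_cloud_sizes(flat_counts, min_px=12, max_px=36):
    """Map a flat [term, count, term, count, ...] facet list to font sizes."""
    terms = flat_counts[0::2]
    counts = flat_counts[1::2]
    if not counts:
        return {}
    lo, hi = min(counts), max(counts)
    span = hi - lo
    sizes = {}
    for term, count in zip(terms, counts):
        # If every count is equal, give all terms the midpoint size.
        scale = 0.5 if span == 0 else (count - lo) / span
        sizes[term] = round(min_px + scale * (max_px - min_px))
    return sizes


# Made-up facet counts, for illustration only.
sample = ["solr", 40, "lucene", 25, "tomcat", 10]
print(tag_cloud_sizes(sample))
```

The resulting sizes could then be fed to whatever reporting or charting tool renders the PDF; that part is outside what the thread describes.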

Re: datadir configuration

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 2:56 PM, clico cl...@mairie-marseille.fr wrote: What do I put in <dataDir>${solr.foo.data.dir:/default/path/to/datadir}</dataDir> ? What is /default/path/to/datadir? Solr variables are written like: ${variable_name:default_value} If you are configuring the dataDir as
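The ${variable_name:default_value} rule described above can be mimicked in a few lines of Python. This is only a sketch of the substitution semantics (use the property if it is set, otherwise fall back to the default), not Solr's actual implementation; the `resolve` helper and the `props` dictionary standing in for JVM system properties are assumptions made for illustration.

```python
import re

# Matches ${name} or ${name:default}; the default may itself contain colons.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text, props):
    """Replace each placeholder with its property value or its default."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)  # no value and no default given
    return _PLACEHOLDER.sub(repl, text)


template = "${solr.foo.data.dir:/default/path/to/datadir}"
# Property not set: the default after the colon is used.
print(resolve(template, {}))
# Property set (e.g. via -Dsolr.foo.data.dir=/opt/solr/data): it wins.
print(resolve(template, {"solr.foo.data.dir": "/opt/solr/data"}))
```

So /default/path/to/datadir is simply whatever path should apply when the property is not passed on the command line.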

Re: Solr Quries

2009-10-07 Thread Shalin Shekhar Mangar
First, please do not cross-post messages to both solr-dev and solr-user. Solr-dev is only for development related discussions. Comments inline: On Wed, Oct 7, 2009 at 9:59 AM, Pravin Karne pravin_ka...@persistent.co.in wrote: Hi, I am new to solr. I have following queries : 1. Is solr

SpellCheck with filter/conditions

2009-10-07 Thread R. Tan
Sorry, newbie here, figured it out. How do you get spelling suggestions on a specific resultset, filtered by a certain facet for example? On Wed, Oct 7, 2009 at 8:43 AM, R. Tan tanrihae...@gmail.com wrote: Nice. In comparison, how do you do it with faceting? Two other approaches are to use

Re: Re : Questions about synonyms and highlighting

2009-10-07 Thread Avlesh Singh
4 - the same question for highlighting with lemmatisation? Settings for manage (all highlighted) == the two words <em>manage</em> and <em>management</em> are highlighted Settings for manage == the first word <em>manage</em> is highlighted but not the second : management There is no Lemmatisation

Re: solr 1.4 formats last_index_time for SQL differently than 1.3 ?!?

2009-10-07 Thread Mint Ekalak
I ran Solr successfully until I updated recently, and now it dies at this line from data-import.xml: where ImportTime '${dataimporter.last_index_time}' I got this error: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select * from newheader where ImportTime 'Wed

ApacheCon US

2009-10-07 Thread Grant Ingersoll
Just a friendly reminder to all about Lucene ecosystem events at ApacheCon US this year. We have two days of talks on pretty much every project under Lucene (see http://lucene.apache.org/#14+August+2009+-+Lucene+at+US+ApacheCon ) plus a meetup and a two day training on Lucene and a 1 day

Re: ISOLatin1AccentFilter before or after Snowball?

2009-10-07 Thread Chantal Ackermann
See http://markmail.org/message/hi25u5iqusfu542b Thank you for the link, Shalin! It could be worth copying that to the wiki? Cheers! Chantal I'd just like to make sure this is so. If it is the case, I'm wondering whether snowball filters for i.e. French process accented language correctly,

Re: Solr Quries

2009-10-07 Thread Sandeep Tagore
Hi Pravin, 1. Is solr work in distributed environment ? if yes, how to configure it? Yep. You can achieve this with Sharding. For example: Install and Configure Solr on two machines and declare any one of those as master. Insert shard parameters while you index and search your data. 2. Is solr

Re: Doing SpellCheck in distributed search

2009-10-07 Thread balaji.a
Thanks Shalin! I applied your patch and deployed the war. While debugging, the overridden method SpellCheckComponent.finishStage is not getting invoked by the SearchHandler. Instead it's invoking the SearchComponent.finishStage method. Do I need to configure anything extra to make it work? My

Re: solr 1.4 formats last_index_time for SQL differently than 1.3 ?!?

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 3:53 PM, Mint Ekalak mint@gmail.com wrote: I run solr successfully until i updated recently and dead at this line where ImportTime '${dataimporter.last_index_time}' from data-import.xml i got this error

Re: Indexing and searching of sharded/ partitioned databases and tables

2009-10-07 Thread Sandeep Tagore
Hi Jayant, You can use Solr to achieve your objective. The data-config.xml which you posted is incomplete. I would like to suggest you a way to index the full data. Try to index a database at a time. Sample xml conf. dataSource type=JdbcDataSource name=ds1 driver=com.mysql.jdbc.Driver

Re: Doing SpellCheck in distributed search

2009-10-07 Thread balaji.a
Sorry! It was my mistake of not copying the war to the correct location. balaji.a wrote: Thanks Shalin! I applied your patch and deployed the war. While debugging, the overridden method SpellCheckComponent.finishStage is not getting invoked by the SearchHandler. Instead it's invoking the

Re: Indexing and searching of sharded/ partitioned databases and tables

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 5:09 PM, Sandeep Tagore sandeep.tag...@gmail.com wrote: Hi Jayant, You can use Solr to achieve your objective. The data-config.xml which you posted is incomplete. Sandeep, the data-config that Jayant posted is not incomplete. The field declaration is not necessary if

Re: Indexing and searching of sharded/ partitioned databases and tables

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 5:09 PM, Sandeep Tagore sandeep.tag...@gmail.com wrote: You can write an automated program which will change the DB conf details in that xml and fire the full import command. You can use http://localhost:8983/solr/dataimport url to check the status of the data import.

Re : Re : Questions about synonyms and highlighting

2009-10-07 Thread Nourredine K.
Thanks Avlesh. Now I understand better how highlighting works. As you've said, since it is based on the analyzers, highlighting will handle things like search. A clarification about the #3 and #4 examples: they are mutually exclusive. I wanted to know how to do highlighting with stemming OR without (not

Re: datadir configuration

2009-10-07 Thread clico
I tried this in my context.xml. It doesn't work: <Environment name="solr/home" type="java.lang.String" value="D:\workspace\solr\home" override="true" /> <Environment name="solr.data.dir"

manage rights

2009-10-07 Thread clico
Hi everybody. As I'm ready to deploy my Solr server (after many tests and use cases), I'd like to configure my server so that some requests cannot be posted. As an example: My CMS or data app can use - dataimport - and other indexing commands. My website can only perform a search on the

Re: solr optimize - no space left on device

2009-10-07 Thread Phillip Farber
All, We're puzzled why we're still unable to optimize a 192GB index on a LVM volume that has 406GB available. We are not using Solr distribution. There is no snapshooter in the picture. We run out of disk capacity with a df showing 100% but a du showing just 379GB of files. Restarting

Re: Problems with DIH XPath flatten

2009-10-07 Thread Adam Foltzer
Here's a sample: ?xml version=1.0 encoding=ISO-8859-1? !DOCTYPE document [ !ENTITY nbsp #160; !ENTITY copy #169; !ENTITY reg #174; ] document kbml version=-//Indiana University//DTD KBML 0.9//EN kbqIn Mac OS X, how do I enable or disable the firewall?/kbq body pkbh docid=aghe

Re: Solr Trunk Heap Space Issues

2009-10-07 Thread Jeff Newburn
Here is what I discovered after dozens of reindexes. We have a tool that is pulling all of the documents' uniqueIds. This tool is causing the cache to fill up. We turned it off and the system was able to reindex. Here is what is still puzzling to me about this entire scenario. When we had

Re: Seattle / PNW Hadoop/Lucene/HBase Meetup, Wed Sep 30th

2009-10-07 Thread Nick Dimiduk
Hey PNW Clouders! I'd really like to chat further with the crew doing distributed Solr. Give me a ring or shoot me an email, let's do lunch! -Nick On Wed, Sep 30, 2009 at 2:10 PM, Nick Dimiduk ndimi...@gmail.com wrote: As Bradford is out of town this evening, I will take up the mantle of

Re: How to retrieve the index of a string within a field?

2009-10-07 Thread Elaine Li
Hi Sandeep, Say the field is <field name="sentence">Can you get what you want?</field>, and the field type is Text. My query contains 'sentence:get what you'. Is it possible to get number 2 directly from a query since the word 'get' is the 2nd token in the sentence? Thanks. Elaine On Wed, Oct 7, 2009 at

Re: How to retrieve the index of a string within a field?

2009-10-07 Thread Sandeep Tagore
Hi Elaine, You can achieve that with some modifications in the Solr configuration files. Generally text will be configured as <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter

Re: Why isn't the DateField implementation of ISO 8601 broader?

2009-10-07 Thread Tricia Williams
Chris Hostetter wrote: : I would expect field:2001-03 to be a hit on a partial match such as : field:[2001-02-28T00:00:00Z TO 2001-03-13T00:00:00Z]. I suppose that my : expectation would be that field:2001-03 would be counted once per day for each : day in its range. It would follow that a user

how can I use debugQuery if I have extended QParserPlugin?

2009-10-07 Thread gdeconto
in a previous post, I asked how I would go about creating virtual function in my solr query; ie: http://127.0.0.1:8994/solr/select...@myfunc(1,2,3,4) I was trying to find a way to more easily/cleanly perform queries against large numbers of dynamic fields (ie field1, field2, field3...field99).

IndexWriter InfoStream in solrconfig not working

2009-10-07 Thread Burton-West, Tom
Hello, We are trying to debug an indexing/optimizing problem and have tried setting the infoStream file in solrconfig.xml so that the SolrIndexWriter will write a log file. Here is our setting: <!-- To aid in advanced debugging, you may turn on IndexWriter debug logging. Uncommenting

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 12:51 PM, Phillip Farber pfar...@umich.edu wrote: In a separate thread, I've detailed how an optimize is taking 2x disk space. We don't use solr distribution/snapshooter.  We are using the default deletion policy = 1. We can't optimize a 192G index in 400GB of space.

Re: TermsComponent or auto-suggest with filter

2009-10-07 Thread Jay Hill
Something like this, building on each character typed: facet=on&facet.field=tc_query&facet.prefix=be&facet.mincount=1 -Jay http://www.lucidimagination.com On Tue, Oct 6, 2009 at 5:43 PM, R. Tan tanrihae...@gmail.com wrote: Nice. In comparison, how do you do it with faceting? Two other
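For anyone assembling these facet parameters in client code, a small Python sketch follows. The parameter names and the tc_query field come from the message above; the helper name `suggest_params` is hypothetical, and a real request would also need a `q` parameter and the Solr select URL, which are left out here.

```python
from urllib.parse import urlencode


def suggest_params(typed, field="tc_query"):
    """Build the facet-prefix query string for the characters typed so far."""
    return urlencode([
        ("facet", "on"),
        ("facet.field", field),
        ("facet.prefix", typed),   # grows with each keystroke
        ("facet.mincount", "1"),   # skip terms with no matching documents
    ])


print(suggest_params("be"))
```

Calling it again with "bea", "beat", and so on narrows the returned facet terms as the user types.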

Facet query pb

2009-10-07 Thread clico
Hello I have a pb trying to retrieve a tree with facet use I 've got a field location_field Each doc in my index has a location_field Location field can be continent/country/city I have 2 queries: http://server/solr//select?fq=(location_field:NORTH*) : ok, retrieve docs

Re: How much disk space does optimize really take

2009-10-07 Thread Jason Rutherglen
It would be good to be able to commit without opening a new reader however with Lucene 2.9 the segment readers for all available segments are already created and available via getReader which manages the reference counting internally. Using reopen redundantly creates SRs that are already held

Re: Facet query pb

2009-10-07 Thread Avlesh Singh
I have no idea what pb means but this is what you probably want - fq=(location_field:(NORTH AMERICA*)) Cheers Avlesh On Wed, Oct 7, 2009 at 10:40 PM, clico cl...@mairie-marseille.fr wrote: Hello I have a pb trying to retrieve a tree with facet use I 've got a field location_field Each doc

Re: Facet query pb

2009-10-07 Thread Christian Zambrano
Clico, Because you are doing a wildcard query, the token 'AMERICA' will not be analyzed at all. This means that 'AMERICA*' will NOT match 'america'. On 10/07/2009 12:30 PM, Avlesh Singh wrote: I have no idea what pb mean but this is what you probably want - fq=(location_field:(NORTH
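The point about unanalyzed wildcard terms can be shown with a toy Python sketch. It assumes the field's index-time analyzer lowercases tokens, which the message implies, and it uses `fnmatchcase` only as a stand-in for wildcard matching against indexed terms; it is an analogy, not how Lucene matches internally, and the term list is made up.

```python
from fnmatch import fnmatchcase

# Terms as they would sit in the index after a lowercasing analyzer (assumed).
indexed_terms = ["north", "america"]


def wildcard_hits(pattern, terms):
    """Return indexed terms matching the pattern, case-sensitively."""
    return [term for term in terms if fnmatchcase(term, pattern)]


# The raw wildcard term is not analyzed, so its case must already match.
print(wildcard_hits("AMERICA*", indexed_terms))
# Lowercasing on the client side before querying finds the term.
print(wildcard_hits("AMERICA*".lower(), indexed_terms))
```

The practical consequence is that the client has to apply the same case normalization the analyzer would have applied before sending a wildcard query.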

Re: How much disk space does optimize really take

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 10:45 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: It would be good to be able to commit without opening a new reader however with Lucene 2.9 the segment readers for all available segments are already created and available via getReader which manages the

Re: How much disk space does optimize really take

2009-10-07 Thread Mark Miller
I think that argument requires auto commit to be on and opening readers after the optimize starts? Otherwise, the optimized version is not put into place until a commit is called, and a Reader won't see the newly merged segments until then - so the original index is kept around in either case -

RE: IndexWriter InfoStream in solrconfig not working

2009-10-07 Thread Giovanni Fernandez-Kincade
I had the same problem. I'd be very interested to know how to get this working... -Gio. -Original Message- From: Burton-West, Tom [mailto:tburt...@umich.edu] Sent: Wednesday, October 07, 2009 12:13 PM To: solr-user@lucene.apache.org Subject: IndexWriter InfoStream in solrconfig not

Default query parameter for one core

2009-10-07 Thread Michael
I'd like to have 5 cores on my box. core0 should automatically shard to cores 1-4, which each have a quarter of my corpus. I tried this in my solrconfig.xml: <requestHandler name="standard" class="solr.SearchHandler" default="true"> <lst name="defaults"> <str

Re: How much disk space does optimize really take

2009-10-07 Thread Phillip Farber
Yonik Seeley wrote: Does this mean that there's always a Lucene IndexReader holding segment files open so they can't be deleted during an optimize so we run out of disk space 2x? Yes. A feature could probably now be developed that avoids opening a reader until it's requested. That

Re: Facet query pb

2009-10-07 Thread Todd Benge
Aq On 10/7/09, clico cl...@mairie-marseille.fr wrote: Hello I have a pb trying to retrieve a tree with facet use I 've got a field location_field Each doc in my index has a location_field Location field can be continent/country/city I have 2 queries:

Re: How much disk space does optimize really take

2009-10-07 Thread Jason Rutherglen
To be clear, the SRs created by merges don't have the term index loaded which is the main cost. One would need to use IndexReaderWarmer to load the term index before the new SR becomes a part of SegmentInfos. On Wed, Oct 7, 2009 at 10:34 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

Re: How to retrieve the index of a string within a field?

2009-10-07 Thread Elaine Li
Sandeep, I do get results when I search for 'get what you', not 0 results. What in my schema makes this difference? <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <!-- in this example,

Re: Question about PatternReplace filter and automatic Synonym generation

2009-10-07 Thread Prasanna Ranganathan
On 10/6/09 3:32 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : I ll try to explain with an example. Given the term 'it!' in the title, it : should match both 'it' and 'it!' in the query as an exact match. Currently, : this is done by using a synonym entry (and index time

How to determine the size of the index?

2009-10-07 Thread Fishman, Vladimir
Is this info available via admin page?

Re: Facet query pb

2009-10-07 Thread clico
That's not a pb. I want to use that in order to drill down a tree. Christian Zambrano wrote: Clico, Because you are doing a wildcard query, the token 'AMERICA' will not be analyzed at all. This means that 'AMERICA*' will NOT match 'america'. On 10/07/2009 12:30 PM, Avlesh Singh wrote:

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 1:50 PM, Phillip Farber pfar...@umich.edu wrote: So this implies that for a normal optimize, in every case, due to the Searcher holding open the existing segment prior to optimize that we'd always need 3x even in the normal case. This seems wrong since it is repeated

Re: How much disk space does optimize really take

2009-10-07 Thread Michael McCandless
On Wed, Oct 7, 2009 at 1:34 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Oct 7, 2009 at 10:45 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: It would be good to be able to commit without opening a new reader however with Lucene 2.9 the segment readers for all

Re: How much disk space does optimize really take

2009-10-07 Thread Phillip Farber
Wow, this is weird. I commit before I optimize. In fact, I bounce tomcat before I optimize just in case. It makes sense, as you say, that then the open reader can only be holding references to segments that wouldn't be deleted until the optimize is complete anyway. But we're still exceeding

Re: How much disk space does optimize really take

2009-10-07 Thread Lance Norskog
Oops, sent before finished. Partial Optimize aka maxSegments is a recent Solr 1.4/Lucene 2.9 feature. As to 2x vs. 3x, the general wisdom is that an optimize on a simple index takes at most 2x disk space, and on a compound index takes at most 3x. Simple is the default (*). At Divvio we had the

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 3:16 PM, Phillip Farber pfar...@umich.edu wrote: Wow, this is weird. I commit before I optimize. In fact, I bounce tomcat before I optimize just in case. It makes sense, as you say, that then the open reader can only be holding references to segments that wouldn't be

Re: How much disk space does optimize really take

2009-10-07 Thread Mark Miller
I can't tell why calling a commit or restarting is going to help anything - or why you need more than 2x in any case. The only reason I can see for this is if you have turned on auto-commit. Otherwise the Reader is *always* only referencing what would have to be around anyway. You're likely to

Re: How much disk space does optimize really take

2009-10-07 Thread Mark Miller
Okay - I think I've got you - you're talking about the case of adding a bunch of docs, not calling commit, and then trying to optimize. I keep coming at it from a cold optimize. Making sense to me now. Mark Miller wrote: I can't tell why calling a commit or restarting is going to help anything -

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 3:31 PM, Mark Miller markrmil...@gmail.com wrote: I can't tell why calling a commit or restarting is going to help anything Depends on what scenarios you consider, and what you are taking 2x of. 1) Open reader on index 2) Open writer and add two documents... the first

Solr Demo at SF New Tech Meetup

2009-10-07 Thread Nasseam Elkarra
Hello all, For those of you in the Bay Area, we will be demoing our Bodukai Boutique product at the SF New Tech Meetup on Wednesday, Oct. 14: http://sfnewtech.com/2009/10/05/1014-sf-new-tech-bodukai-yourversion-meehive-and-more/ Bodukai Boutique is the fastest ecommerce search and navigation

Re: manage rights

2009-10-07 Thread Lance Norskog
There are no security features in Solr 1.4. You cannot do this. It would be really simple to implement a hack where all management must be done via POST, and then allow the configuration to ban POST requests. On 10/7/09, clico cl...@mairie-marseille.fr wrote: Hi everybody As I'm ready to

Re: solr reporting tool adapter

2009-10-07 Thread Lance Norskog
The BIRT project can do what you want. It has a nice form creator and you can configure http XML input formats. It includes very complete Eclipse plugins and there is a book about it. On 10/7/09, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Oct 7, 2009 at 2:51 PM, Rakhi Khatwani

Re: How much disk space does optimize really take

2009-10-07 Thread Mark Miller
Yonik Seeley wrote: On Wed, Oct 7, 2009 at 3:31 PM, Mark Miller markrmil...@gmail.com wrote: I can't tell why calling a commit or restarting is going to help anything Depends on what scenarios you consider, and what you are taking 2x of. 1) Open reader on index 2) Open writer and

Re: solr 1.4 formats last_index_time for SQL differently than 1.3 ?!?

2009-10-07 Thread michael8
2 things I noticed that are different from 1.3 to 1.4 for DataImport: 1. there are now 2 datetime values (per my specific schema I'm sure) in the dataimport.properties vs. only 1 in 1.3 (using the exact same schema). One is 'last_index_time' same as 1.3, and a *new* one (in 1.4) named

RE: Help with denormalizing issues

2009-10-07 Thread Eric Reeves
Hi again, I'm gonna try this again with more focus this time :D 1) Ideally what we would like to do, is plug in an additional mechanism to filter the initial result set, because we can't find a way to implement our filtering needs as filter queries against a single index. We would want to do

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 3:56 PM, Mark Miller markrmil...@gmail.com wrote: I guess you can't guarantee 2x though, as if you have queries coming in that take a while, a commit opening a new Reader will not guarantee the old Reader is quite ready to go away. Might want to wait a short bit after

Re: solr 1.4 formats last_index_time for SQL differently than 1.3 ?!?

2009-10-07 Thread Shalin Shekhar Mangar
On Thu, Oct 8, 2009 at 1:38 AM, michael8 mich...@saracatech.com wrote: 2 things I noticed that are different from 1.3 to 1.4 for DataImport: 1. there are now 2 datetime values (per my specific schema I'm sure) in the dataimport.properties vs. only 1 in 1.3 (using the exact same schema). One

Re: manage rights

2009-10-07 Thread Grant Ingersoll
You should also separate your indexer from your searcher and make the searcher request handlers allow search only (remove the handlers you don't need). You could also lock down the request parameters that they take, too, by using invariants, etc. Have a look in your solrconfig.xml. You

Re: Help with denormalizing issues

2009-10-07 Thread Lance Norskog
The separate sku do not become one long text string. They are separate values in the same field. The relevance calculation is completely separate per value. The performance problem with the field collapsing patch is that it does the same thing as a facet or sorting operation: it does a sweep

Problems with WordDelimiterFilterFactory

2009-10-07 Thread Bernadette Houghton
We are having some issues with our solr parent application not retrieving records as expected. For example, if the input query includes a colon (e.g. hot and cold: temperatures), the relevant record (which contains a colon in the same place) does not get retrieved; if the input query does not

Re: Problems with WordDelimiterFilterFactory

2009-10-07 Thread Christian Zambrano
Could you please provide the exact URL of a query where you are experiencing this problem? eg(Not URL encoded): q=fieldName:hot and cold: temperatures On 10/07/2009 05:32 PM, Bernadette Houghton wrote: We are having some issues with our solr parent application not retrieving records as

Re: Indexing and searching of sharded/ partitioned databases and tables

2009-10-07 Thread Jayant Kumar Gandhi
Thanks guys. Now I can easily search thru 10TB of my personal photos, videos, music and other stuff :) At some point I had split them into multiple db and tables and inserts to a single db/ table were taking too much time once the index grew beyond 1gig. I was storing all the possible metadata

RE: Problems with WordDelimiterFilterFactory

2009-10-07 Thread Bernadette Houghton
Hi Christian, try this one - http://www.deakin.edu.au/dro/view/DU:3601 Either scroll down and click one of the television broadcasting -- asia links, or type it in the Quick Search box. TIA bern -Original Message- From: Christian Zambrano [mailto:czamb...@gmail.com] Sent:

Snapshot is not created when I added spellchecker with buildOnCommit

2009-10-07 Thread marklo
I enabled the snapshooter to run after commit and it was working fine until I added a spellchecker with buildOnCommit = true... Any idea why? Thanks <updateHandler class="solr.DirectUpdateHandler2"> <listener event="postCommit" class="solr.RunExecutableListener"> <str

Re: ISOLatin1AccentFilter before or after Snowball?

2009-10-07 Thread Jay Hill
Correct me if I'm wrong, but wasn't the ISOLatin1AccentFilterFactory deprecated in favor of: <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/> in 1.4? -Jay http://www.lucidimagination.com On Wed, Oct 7, 2009 at 1:44 AM, Shalin Shekhar Mangar

Re: ISOLatin1AccentFilter before or after Snowball?

2009-10-07 Thread Koji Sekiguchi
No, ISOLatin1AccentFilterFactory is not deprecated. You can use either MappingCharFilterFactory+mapping-ISOLatin1Accent.txt or ISOLatin1AccentFilterFactory whichever you'd like. Koji Jay Hill wrote: Correct me if I'm wrong, but wasn't the ISOLatin1AccentFilterFactory deprecated in favor of:

Re: Problems with WordDelimiterFilterFactory

2009-10-07 Thread Christian Zambrano
Bern, I am interested in the Solr query. In other words, the query that your system sends to Solr. Thanks, Christian On Oct 7, 2009, at 5:56 PM, Bernadette Houghton bernadette.hough...@deakin.edu.au wrote: Hi Christian, try this one - http://www.deakin.edu.au/dro/view/DU:3601

Re: Problems with WordDelimiterFilterFactory

2009-10-07 Thread marklo
Use http://solr-url/solr/admin/analysis.jsp to see how your data is indexed/queried -- View this message in context: http://www.nabble.com/Problems-with-WordDelimiterFilterFactory-tp25795589p25797377.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr 1.4 formats last_index_time for SQL differently than 1.3 ?!?

2009-10-07 Thread Mint Ekalak
Works like a charm! Thanks Shalin. Regards, Mint Shalin Shekhar Mangar wrote: On Thu, Oct 8, 2009 at 1:38 AM, michael8 mich...@saracatech.com wrote: 2 things I noticed that are different from 1.3 to 1.4 for DataImport: 1. there are now 2 datetime values (per my specific schema

Re: TermsComponent or auto-suggest with filter

2009-10-07 Thread R. Tan
Thanks Jay. What's a good way of extracting the original text from here? On Thu, Oct 8, 2009 at 1:03 AM, Jay Hill jayallenh...@gmail.com wrote: Something like this, building on each character typed: facet=on&facet.field=tc_query&facet.prefix=be&facet.mincount=1 -Jay

Scoring for specific field queries

2009-10-07 Thread R. Tan
Hi, How can I get wildcard search (e.g. cha*) to score documents based on the position of the keyword in a field? Closer (to the start) means higher score. For example, I have multiple documents with titles containing the word champion. Some of the document titles start with the word champion and

Re: How to determine the size of the index?

2009-10-07 Thread Sandeep Tagore
Are you referring to schema info ??? You can find it at http://192.168.5.25/solr/admin/file/?file=schema.xml and http://192.168.5.25/solr/admin/schema.jsp Fishman, Vladimir wrote: Is this info available via admin page? -- View this message in context:

Re: Scoring for specific field queries

2009-10-07 Thread Avlesh Singh
You would need to boost your startswith matches artificially for the desired behavior. I would do it this way - 1. Create a KeywordTokenized field with n-gram filter. 2. Create a Whitespace tokenized field with n-gram filter. 3. Search on both the fields, boost matches for #1 over #2.
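Step 3 of the approach above might look like the following when the query string is built in Python. The `^` boost operator is standard Lucene query syntax; the field names `title_start` and `title_words` (standing for the fields from steps 1 and 2), the boost value, and the helper name are all hypothetical, and the sketch assumes the typed prefix contains no characters that need escaping.

```python
def startswith_boost_query(prefix, start_field="title_start",
                           words_field="title_words", boost=5):
    """Search both n-gram fields, weighting the start-of-title field higher."""
    # Because both fields hold n-grams, the prefix is sent as a plain term
    # rather than as a wildcard query.
    return f"{start_field}:{prefix}^{boost} OR {words_field}:{prefix}"


print(startswith_boost_query("cha"))
```

The relative boost would need tuning against real data; 5 is only a placeholder.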

Re: How to retrieve the index of a string within a field?

2009-10-07 Thread Sandeep Tagore
Elaine, The field type text contains <tokenizer class="solr.WhitespaceTokenizerFactory"/> in its definition. So all the sentences that are indexed / queried will be split into words. So when you search for 'get what you', you will get sentences containing get, what, you, get what, get you, what you,

Re: Scoring for specific field queries

2009-10-07 Thread Sandeep Tagore
Hi Rihaed, I guess we don't need to depend on scores all the time. You can use custom sort to sort the results. Take a dynamicField, fill it with indexOf(keyword) value, sort the results by the field in ascending order. Then the records which contain the keyword at the earlier position will come
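The ordering this suggestion aims for can be sketched in Python. Note the swap: the message proposes storing the position in a dynamicField and sorting inside Solr, whereas this sketch sorts a list of titles on the client side purely to show the intended result. The function name and the sample titles are made up.

```python
def sort_by_keyword_position(titles, keyword):
    """Order titles so that an earlier keyword position comes first."""
    kw = keyword.lower()

    def key(title):
        pos = title.lower().find(kw)  # the indexOf(keyword) equivalent
        # Titles that do not contain the keyword sort after all matches.
        return (pos < 0, pos)

    return sorted(titles, key=key)


titles = ["The champion returns", "Champion of the world", "A true champion"]
print(sort_by_keyword_position(titles, "champion"))
```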

Re: Scoring for specific field queries

2009-10-07 Thread Avlesh Singh
I guess we don't need to depend on scores all the time. You can use custom sort to sort the results. Take a dynamicField, fill it with indexOf(keyword) value, sort the results by the field in ascending order. Then the records which contain the keyword at the earlier position will come

Re: delay while adding document to solr index

2009-10-07 Thread swapna_here
Thanks for your reply, and sorry for the delay. As you said, I have removed the commit while adding a single document and set the auto commit to <maxDocs>200</maxDocs> <maxTime>1</maxTime>. After setting this, when I run optimize() manually the size decreased to 350MB (10 docs) from

RE: Problems with WordDelimiterFilterFactory

2009-10-07 Thread Sandeep Tagore
Hi Bern, I indexed some records with - and : today using your configuration and I searched with following urls http://localhost/solr/select?q=CONTENT:cold : temperature http://localhost/solr/select?q=CONTENT:cold: temperature http://localhost/solr/select?q=CONTENT:cold :temperature