Re: Solr - Tomcat new versions

2012-01-17 Thread Alessio Crisantemi
Hi, I installed Apache Tomcat on Windows (Vista) and Solr, but I have a problem between Tomcat 7.0.23 and Solr 3.5. There is no problem if I install Solr 1.4.1 with the same version of Tomcat. (I checked it with both the binary and the source code installation of Tomcat, but the result is the same.) It's a bug, I

Re: SolrJ Embedded

2012-01-17 Thread Maxim Veksler
On Tue, Jan 17, 2012 at 3:13 AM, Erick Erickson erickerick...@gmail.com wrote: I don't see why not. I'm assuming a *nix system here so when Solr updated an index, any deleted files would hang around. But I have to ask why bother with the Embedded server in the first place? You already have a

Re: Query regarding solr custom sort order

2012-01-17 Thread umaswayam
Hi, Let me clarify the situation here in detail. The default sort which WebSphere Commerce provides is based on the name and price of any item, but we have unique values for every item. Hence sorting works fine either as integer or as string, but during preprocessing we generate some temporary tables

Re: Solr - Tomcat new versions

2012-01-17 Thread Luca Cavanna
Hi Alessio, I've seen Solr 3.5 running within Tomcat 7.0.23, so I guess it shouldn't be a bug. Could you please provide some more details about the problem you have? Do you have a stacktrace? You are upgrading an existing Solr 1.4.1, right? By the way, which JDK are you using? Thanks Luca On Tue,

Re: Solr - Tomcat new versions

2012-01-17 Thread Alessio Crisantemi
Dear Luca, I followed the Solr installation procedure described in the official guide, but it doesn't work with Solr 3.5, while with Solr 1.4.1 everything is fine. I don't know why... For now I am working with Solr 1.4.1, and one more thing: I would like to install Tika 1.0 on Solr 1.4.1. Is that possible? How can I do it? Can you help me?

Re: FacetComponent: suppress original query

2012-01-17 Thread Dmitry Kan
Yes, that's what I have started to use already. Probably, this is the easiest solution. Thanks. On Tue, Jan 17, 2012 at 3:03 AM, Erick Erickson erickerick...@gmail.com wrote: Why not just up the maxBooleanClauses parameter in solrconfig.xml? Best Erick On Sat, Jan 14, 2012 at 1:41 PM,

Re: Solr - Tomcat new versions

2012-01-17 Thread Luca Cavanna
Hi Alessio, in order to help you, we'd need to know something more about what's going wrong. Could you give us a stacktrace or an error you're reading? How do you know solr isn't working? Thanks Luca On Tue, Jan 17, 2012 at 10:52 AM, Alessio Crisantemi alessio.crisant...@gioconews.it wrote:

really slow performance when trying to get facet.field

2012-01-17 Thread Daniel Bruegge
Hi, I have 2 Solr shards. One is filled with approx. 25mio documents (local index 6GB), the other with 10mio documents (2.7GB size). I am trying to create some kind of 'word cloud' to see the frequency of words for a *text_general* field. For this I am currently using a facet over this field and

Re: really slow performance when trying to get facet.field

2012-01-17 Thread Dmitry Kan
I had a similar problem for a similar task. In my case, merging the results from two shards turned out to be the culprit. If you can logically store your data in just one shard, your faceting should become faster. Size-wise it should not be a problem for SOLR. Also, you didn't say anything about

Re: Solr - Tomcat new versions

2012-01-17 Thread Erik Hatcher
Perhaps this is the known issue with the 3.5 example schema being used in Tomcat and the VelocityResponseWriter? I'm on my mobile now so I don't have easy access to a pointer with details, but check the archives for how to resolve it if this seems to be the issue. Erik On Jan 17, 2012,

Re: really slow performance when trying to get facet.field

2012-01-17 Thread Daniel Bruegge
Hi Dmitry, I had everything on one Solr instance before, but this got too heavy and I had the same issue there: the first facet.query was really slow. When querying the facet: - facet.limit = 100 Cache settings are like this: <filterCache class="solr.FastLRUCache" size="16384"

Re: really slow performance when trying to get facet.field

2012-01-17 Thread Dmitry Kan
Hi Daniel, My index is 6,5G. I'm sure it can be bigger. facet.limit we ask for is beyond 100 thousand. It is sub-second speed. I run it with -Xms1024m -Xmx12000m under tomcat, it currently takes 5,4G of RAM. Amount of docs is over 6,5 million. Do you see any evictions in your caches? What kind

Re: really slow performance when trying to get facet.field

2012-01-17 Thread Daniel Bruegge
Evictions are 0 for all cache types. Your server's max heap space of 12G is pretty huge, which is good I think. The CPU on my server is an 8-core Intel i7 965. Commit frequency is low, because shards are added and old shards exist for historical reasons. Old shards will then be cleaned after

Function in facet.query like min,max

2012-01-17 Thread Eric Grobler
Hi Solr community, Is it possible to return the lowest, highest and average price of a search result using facets? I tried something like: facet.query={!max(price,0)} Is it possible and what is the correct syntax? q=htc android facet=true facet.query=price:[* TO 10] facet.query=price:[11 TO 100]

Re: Trying to understand SOLR memory requirements

2012-01-17 Thread Dave
Thank you Robert, I'd appreciate that. Any idea how long it will take to get a fix? Would I be better switching to trunk? Is trunk stable enough for someone who's very much a SOLR novice? Thanks, Dave On Mon, Jan 16, 2012 at 10:08 PM, Robert Muir rcm...@gmail.com wrote: looks like

How can I index this?

2012-01-17 Thread ahammad
Hello, I am looking into indexing two data sources. One of those is a standard website and the other is a SharePoint site. The problem is that I have no direct database access. Normally I would just use the DIH and get what I need from the DB. I do have a Java DAO (data access object) class that

first time query is very slow

2012-01-17 Thread gabriel shen
Hi, I have a Solr 3.3 index of 200,000 documents; all text is stored and the total index size is 27 GB. I used a dismax query with over 10 qf and pf boosting fields each, plus sorting on score and 2 other fields. It took quite a few seconds (5-8) for the first query to return any result (no

Re: Trying to understand SOLR memory requirements

2012-01-17 Thread Robert Muir
I committed it already: so you can try out branch_3x if you want. you can either wait for a nightly build or compile from svn (http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/). On Tue, Jan 17, 2012 at 8:35 AM, Dave dla...@gmail.com wrote: Thank you Robert, I'd appreciate that.

[Job] Sales Engineer at Lucid Imagination

2012-01-17 Thread Grant Ingersoll
Hi Solr Users, Lucid Imagination is looking for a sales engineer. If you know search, Solr and like working with customers, the sales engineer job may be of interest to you. I've included the job description below. If you are interested, please send your resume (off-list) to

Re: first time query is very slow

2012-01-17 Thread darren
First query will cause the index caches to be warmed up and this is why the first query takes some time. You can prewarm the caches with a query (when solr starts up) of your choosing in the config file. Google around the SolrWiki on cache/index warming. hth hi, I had an solr3.3 index of

PositionIncrementGap inside a field

2012-01-17 Thread marotosg
Hi. At the moment I have a multivalued field where I would like to add information with gaps at the end of every line in the multivalued field, and I would like to add gaps in the middle of the lines as well. For instance IBM Corporation some information *here a gap* more

PositionIncrementGap inside a field

2012-01-17 Thread marotosg
Hi. At the moment I have a multivalued field where I would like to add information with gaps at the end of every line in the multivalued field, and I would like to add gaps in the middle of the lines as well. For instance <field name="CompaniesData" type="text" indexed="true" stored="true"

Re: first time query is very slow

2012-01-17 Thread gabriel shen
Thanks darren, I understand queries will take longer before the caches are warmed up. What I am trying to find out is why, in the situation where we have no cache, the query takes so long to complete, and what the bottleneck is. For example, if I remove all qf, pf fields, the query speed will improve

Re: SolrJ Embedded

2012-01-17 Thread Erick Erickson
Quantify "slower"; does it matter? At issue is that usually Solr spends far more time doing the search than transmitting the query and response over HTTP. HTTP is not really slow *as a protocol* in the first place. The usual place people have problems here is when there are a bunch of requests made

Re: PositionIncrementGap inside a field

2012-01-17 Thread Erick Erickson
This is just adding the field repeatedly, something like <doc> <field name="companiesdata">IBM Corporation some information</field> <field name="companiesdata">more information</field> <field name="companiesdata">IBM limited more info</field> <field name="companiesdata">and some more data</field> </doc> The
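As an illustration of the "add the field repeatedly" approach, here is a minimal Python sketch that builds a Solr XML update message with one field element per value. The function name build_add_message is made up for this example, and the document carries no unique key field, which a real schema would probably require.

```python
import xml.etree.ElementTree as ET


def build_add_message(field_name, values):
    """Build a Solr XML <add> message that repeats one field per value.

    Hypothetical helper for illustration; it only builds the XML string
    and does not talk to a Solr server.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for value in values:
        # One <field> element per value makes the field multivalued.
        field = ET.SubElement(doc, "field", name=field_name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


message = build_add_message(
    "companiesdata",
    [
        "IBM Corporation some information",
        "more information",
        "IBM limited more info",
        "and some more data",
    ],
)
print(message)
```

Each repeated field becomes one value of the multivalued field, so the positionIncrementGap configured on the field type applies between those values at index time.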

Re: Solr Cloud Indexing

2012-01-17 Thread Erick Erickson
This only really makes sense if you don't have enough in-house resources to do your indexing locally, but it certainly is possible. Amazon's EC2 has been used, but really any hosting service should do. Best Erick On Tue, Jan 17, 2012 at 12:09 AM, Sujatha Arun suja.a...@gmail.com wrote: Would

Re: Function in facet.query like min,max

2012-01-17 Thread Erick Erickson
have you seen the Stats component? See: http://wiki.apache.org/solr/StatsComponent Best Erick On Tue, Jan 17, 2012 at 8:34 AM, Eric Grobler impalah...@googlemail.com wrote: Hi Solr community, Is it possible to return the lowest, highest and average price of a search result using facets? I

Re: How can I index this?

2012-01-17 Thread Erick Erickson
This sounds like, for the database source, that using SolrJ would be the way to go. Assuming you can access the database from Java this is pretty easy. As for the website, Nutch is certainly an option... But I'm a little puzzled. You mention a website, and sharepoint as your sources, then ask

Re: How can I index this?

2012-01-17 Thread ahammad
Perhaps I was a little confusing... Normally when I have DB access, I do a regular indexing process using DIH. For these two sources, I do not have direct DB access. I can only view the two sources like any end user would. I do have a Java class that can get the information that I need. That

Re: Function in facet.query like min,max

2012-01-17 Thread Eric Grobler
Yes, I have, but unfortunately it works on the whole index and not for a particular query. On Tue, Jan 17, 2012 at 3:37 PM, Erick Erickson erickerick...@gmail.com wrote: have you seen the Stats component? See: http://wiki.apache.org/solr/StatsComponent Best Erick On Tue, Jan 17, 2012 at

Re: PositionIncrementGap inside a field

2012-01-17 Thread marotosg
Hi Erick. Thanks for your answer. This is almost what I want to do, but my problem is that I want to be able to introduce two different sizes of gaps. Something like <arr name="CompaniesData"> <str>IBM Corporation some information *gap of 30* more information *gap of 100*</str> <str>

Re: How can I index this?

2012-01-17 Thread Erick Erickson
Well, if you can make an HTTP request, you can parse the return and stuff it into a SolrInputDocument in SolrJ and then send it to Solr. At least that seems possible if I'm understanding your setup. There are other Solr clients that allow similar processes, but the Java version is the one I know

Re: Function in facet.query like min,max

2012-01-17 Thread Erick Erickson
I don't believe that's the case, have you tried it? From the page I referenced: "The stats component returns simple statistics for indexed numeric fields within the DocSet." And running a very quick test on the example data, I get different results when I use *:* and name:maxtor. That said, I'm
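To make the point about the DocSet concrete, here is a small Python sketch that builds a StatsComponent request restricted by the main query. The base URL is an assumption (the default example port), the helper name stats_url is made up, and the query and field come from the original question. It only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode


def stats_url(base, query, field):
    """Return a select URL asking for statistics on one numeric field.

    Hypothetical helper: rows=0 suppresses the document list so only
    the stats section of the response is of interest.
    """
    params = [
        ("q", query),           # the stats are computed over the docs matching q
        ("rows", "0"),
        ("stats", "true"),
        ("stats.field", field),
    ]
    return base + "?" + urlencode(params)


# Assumed local example server; adjust to your own installation.
url = stats_url("http://localhost:8983/solr/select", "htc android", "price")
print(url)
```

Changing q (for example from *:* to name:maxtor, as in the quick test above) should change the reported min, max and mean, since the statistics follow the result set.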

Re: PositionIncrementGap inside a field

2012-01-17 Thread Erick Erickson
Hmmm, no, I don't know how to do that out of the box. Two things: 1) Why do you want to do this? Perhaps if you describe the high-level problem you're trying to solve there might be other ways to approach it. 2) I *think* you could write your own Tokenizer that recognized the special

How to return the distance geo distance on solr 3.5 with bbox filtering

2012-01-17 Thread Maxim Veksler
Hello, I'm querying with bbox, which should be faster than geodist. My queries look like this: http://localhost:8983/solr/select?indent=true&fq={!bbox}&sfield=loc&pt=39.738548,-73.130322&d=100&sort=geodist()%20asc&q=trafficRouteId:235 The trouble is that with bbox Solr does not return the

Re: Sorting results within the fields

2012-01-17 Thread aronitin
It's been almost a week and there has been no response to the question that I asked. Does the question lack detail, or is there no way to achieve this in Lucene? -- View this message in context: http://lucene.472066.n3.nabble.com/Sorting-results-within-the-fields-tp3656049p3666983.html Sent

Re: Function in facet.query like min,max

2012-01-17 Thread Eric Grobler
Hi Erick Thanks for your feedback. I will try it tomorrow - if it works it will be perfect for my needs. Have a nice day Ericz On Tue, Jan 17, 2012 at 4:28 PM, Erick Erickson erickerick...@gmail.com wrote: I don't believe that's the case, have you tried it? From the page I referenced: The

Re: really slow performance when trying to get facet.field

2012-01-17 Thread Daniel Bruegge
Ok, I have now changed the static warming in solrconfig.xml using firstSearcher and newSearcher. Content is my field to facet on. Now the commits take longer, which is OK for me, but the searches are much faster right now. I also reduced the number of documents on my shards to 15mio/shard. So the

Re: Sorting results within the fields

2012-01-17 Thread Jan Høydahl
Hi, Complex problems like this are much better explained with concrete examples than with generalized text. Please create a real example with real documents and their content, along with real queries. You don't explain what "the score value which is generate by my application" is - which application

Re: first time query is very slow

2012-01-17 Thread Yonik Seeley
On Tue, Jan 17, 2012 at 9:39 AM, gabriel shen xshco...@gmail.com wrote: For those customers who unluckily send un-prewarmed query, they will suffer from bad response time, it is not too pleasant anyway. The warming caches part isn't about unique queries, but more about caches used for sorting

Facet auto-suggest

2012-01-17 Thread Jon Drukman
I don't even know what to call this feature. Here's a website that shows the problem: http://pulse.audiusanews.com/pulse/index.php Notice that you can end up in a situation where there are no results. For example, in order, press: People, Performance, Technology, Photos. The client wants it so

Re: Facet auto-suggest

2012-01-17 Thread Jan Høydahl
Hi, Sure, you can use filters and facets for this. Start a query with ...facet.field=source&facet.field=topics&facet.field=type When you click a button, you set the corresponding filter (fq=source:people), and the new query will return the same facets with new counts. In the Audi example, you
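The filter-and-facet flow described above could be sketched in Python like this. The base URL, the match-all q, the helper name facet_query and the facet.mincount=1 setting are assumptions added for illustration; the facet fields and the fq=source:people filter come from the message.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

FACET_FIELDS = ["source", "topics", "type"]


def facet_query(base, selected):
    """Build a faceted query URL with one fq per button already clicked.

    `selected` is a list of (field, value) pairs. Hypothetical helper
    that only builds the URL.
    """
    params = [("q", "*:*"), ("facet", "true")]
    # Assumption: hide facet values with zero hits so the UI can avoid
    # offering buttons that would lead to an empty result.
    params.append(("facet.mincount", "1"))
    params += [("facet.field", name) for name in FACET_FIELDS]
    params += [("fq", f"{field}:{value}") for field, value in selected]
    return base + "?" + urlencode(params)


url = facet_query("http://localhost:8983/solr/select", [("source", "people")])
decoded = parse_qs(urlsplit(url).query)
print(decoded["facet.field"], decoded["fq"])
```

After each click the client appends another (field, value) pair and reissues the query; the returned facet counts then show which remaining buttons still have matching documents.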

Re: Solr Cloud Indexing

2012-01-17 Thread Lance Norskog
Cloud upload bandwidth is free, but download bandwidth costs money. If you upload a lot of data but do not query it often, Amazon can make sense. You can also rent much cheaper hardware in other hosting services where you pay by the month or even by the year. If you know you have a cap on how

Re: Trying to understand SOLR memory requirements

2012-01-17 Thread Lance Norskog
Which version of Solr do you use? 3.1 and 3.2 had a memory leak bug in spellchecking. This was fixed in 3.3. On Tue, Jan 17, 2012 at 5:59 AM, Robert Muir rcm...@gmail.com wrote: I committed it already: so you can try out branch_3x if you want. you can either wait for a nightly build or compile

Re: Sorting results within the fields

2012-01-17 Thread aronitin
Hi Jan, Thanks for the reply. Here is the concrete explanation of the problem that I'm trying to solve. *SOLR Schema* Here is the definition of the SOLR schema *There are 3 dynamic fields* <dynamicField name="*_conceptid" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"

Re: Highlighting text field when query is for string field

2012-01-17 Thread solrdude
Just to be clear, I do a phrase query on a string field like q=keyword_text:smooth skin. I am expecting highlighting to be done on the excerpt field. What I see is: <lst name="highlighting"> <lst name="18602-1973"/> <lst name="18603-1973"/> <lst name="18604-1973"/> </lst> These numbers are the unique IDs of documents. Where

Question on Reverse Indexing

2012-01-17 Thread Shyam Bhaskaran
Hi, For reverse indexing we are using the ReversedWildcardFilterFactory on Solr 4.0 <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/> ReversedWildcardFilterFactory was helping us to perform leading wild card

Re: Question on Reverse Indexing

2012-01-17 Thread François Schiettecatte
Using ReversedWildcardFilterFactory will double the size of your dictionary (more or less), maybe the drop in performance that you are seeing is a result of that? François On Jan 17, 2012, at 9:01 PM, Shyam Bhaskaran wrote: Hi, For reverse indexing we are using the
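A toy Python model may help show why indexing reversed tokens alongside the originals roughly doubles the dictionary, and how a leading wildcard then turns into a prefix scan. This is a simplified sketch, not Solr's implementation: the marker character, the function names and the sample tokens are all assumptions chosen for illustration.

```python
MARKER = "\u0001"  # assumed marker; keeps reversed terms apart from originals


def build_dictionary(tokens, with_original=True):
    """Return the set of indexed terms for a list of tokens."""
    terms = set()
    for token in tokens:
        terms.add(MARKER + token[::-1])  # reversed form, e.g. "kcol" for "lock"
        if with_original:
            terms.add(token)             # original form kept as well
    return terms


def leading_wildcard(pattern, terms):
    """Answer a '*suffix' query as a prefix scan over the reversed terms."""
    if not pattern.startswith("*"):
        raise ValueError("expected a pattern of the form '*suffix'")
    prefix = MARKER + pattern[1:][::-1]
    # Strip the marker and reverse again to recover the original tokens.
    return sorted(t[len(MARKER):][::-1] for t in terms if t.startswith(prefix))


tokens = ["lock", "unlock", "clock", "key"]
terms = build_dictionary(tokens)
matches = leading_wildcard("*lock", terms)
print(len(tokens), len(terms), matches)
```

In this model four tokens produce eight terms, which is the "double the dictionary" effect, while a query such as *lock only has to scan terms sharing the reversed prefix instead of every term in the index.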

RE: Question on Reverse Indexing

2012-01-17 Thread Shyam Bhaskaran
Hi Francois, I understand that disabling of ReversedWildcardFilterFactory has improved the performance. But I am puzzled over how the leading wild card search like *lock is working even though I have now disabled the ReversedWildcardFilterFactory and the indexes have been created without

Re: DataImportHandler in Solr 4.0

2012-01-17 Thread Rob
Not a java pro, and the documentation hasn't been updated to include these instructions (at least that I could find). What do I need to do to perform the steps that Alexandre is talking about? -- View this message in context:

Re: Can Apache Solr Handle TeraByte Large Data

2012-01-17 Thread Otis Gospodnetic
Could indexing English Wikipedia dump over and over get you there? Otis  Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html From: Memory Makers memmakers...@gmail.com To: solr-user@lucene.apache.org

Re: Solr - Tika(?) memory leak

2012-01-17 Thread Otis Gospodnetic
You'll need to reindex everything indeed. Otis  Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html From: Wayne W waynemailingli...@gmail.com To: solr-user@lucene.apache.org Sent: Tuesday, January 17,