Schema design for parent child field

2013-06-29 Thread Sperrink
Good day, I'm seeking some guidance on how best to represent the following data within a solr schema. I have a list of subjects which are detailed to n levels. Each document can contain many of these subject entities. As I see it if this had been just 1 subject per document, dynamic fields would

increase search score of certain category only for certain keyword

2013-06-29 Thread winsu
Hi, I currently have some sample data:
name: summer boot, category: boot shoe
name: snow boot, category: boot shoe
name: boot pant, category: pants
name: modern boot pant, category: pants
name: modern bootcut, category: pants
If the search keyword is boot, how to make the item with

Re: FileDataSource vs JdbcDataSouce (speed) Solr 3.5

2013-06-29 Thread Ahmet Arslan
Hi Mike, You could try http://wiki.apache.org/solr/UpdateCSV  And make sure you commit at the very end. From: Mike L. javaone...@yahoo.com To: solr-user@lucene.apache.org solr-user@lucene.apache.org Sent: Saturday, June 29, 2013 3:15 AM Subject:

Http status 503 Error in solr cloud setup

2013-06-29 Thread Sagar Chaturvedi
Hi, I set up 2 solr instances on 2 different machines and configured 2 zookeeper servers on these machines as well. When I start solr on both machines and try to access the solr web-admin, I get the following error in the browser - Http status 503 - server is shutting down When I setup a single

Re: Schema design for parent child field

2013-06-29 Thread Jack Krupansky
Both dynamic fields and multivalued fields are powerful Solr features that can be used to great effect, but only if used in moderation - a relatively small number of discrete values (e.g., a few dozen strings). Anything more complex and you are asking for trouble and creating a

Re: increase search score of certain category only for certain keyword

2013-06-29 Thread Jack Krupansky
Use the edismax query parser with a higher boost for category than name: qf=name category^10.0 Tune the boost as needed for your app. Make sure name and category have both text and string variants - use copyField. The string variant is good for facets, the text variant is good for keyword
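The edismax suggestion above can be sketched as a set of request parameters. This is only an illustration: the helper name build_edismax_params is hypothetical, and the field names name and category and the boost of 10.0 are taken from the message, to be tuned per application.

```python
from urllib.parse import urlencode


def build_edismax_params(keyword, category_boost=10.0):
    """Return Solr request parameters that weight category matches above name matches.

    Hypothetical helper for illustration; only defType, q and qf are standard
    Solr parameters here.
    """
    return {
        "defType": "edismax",                     # select the edismax query parser
        "q": keyword,                             # the user's keyword, e.g. "boot"
        "qf": f"name category^{category_boost}",  # query fields with per-field boost
    }


params = build_edismax_params("boot")
query_string = urlencode(params)  # URL-encode for use after /select?
print(query_string)
```

The resulting string would be appended to a select URL of the core in question; the host and core name are left out because the thread does not give them.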

Re: cores sharing an instance

2013-06-29 Thread Peyman Faratin
It's the singleton pattern, where in my case I want an object (which is RAM expensive) to be a centralized coordinator of application logic. Thank you. On Jun 29, 2013, at 1:16 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: There is very little shared between multiple cores

Re: Solr 4.3.0 DIH problem with MySQL datetime being imported with time as 00:00:00

2013-06-29 Thread Bill Au
I just double-checked my config. We are using convertType=true. Someone else came up with the config so I am not sure why we are using it. I will try with it set to false to see if something else will break. Thanks for pointing that out. This is my first time using DIH. I really like what I

Re: cores sharing an instance

2013-06-29 Thread Roman Chyla
Cores can be reloaded, they are inside solrcore loader /I forgot the exact name/, and they will have different classloaders /that's a servlet thing/, so if you want singletons you must load them outside of the core, using a parent classloader - in the case of jetty, this means writing your own jetty

Re: Solr 4.3.0 DIH problem with MySQL datetime being imported with time as 00:00:00

2013-06-29 Thread Bill Au
Setting convertType=false does solve the datetime issue. But there are now other columns that were working before but not working now. Since I have already done some research into the datetime to date issue and not been able to find a solution, I think I will have to keep convertType set to

Re: broken links returned from solr search

2013-06-29 Thread Erick Erickson
What links? You haven't shown us what link you're clicking on that generates the 404 error. You might want to review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Fri, Jun 28, 2013 at 2:04 PM, MA LIG mewa...@gmail.com wrote: Hello, I ran the solr example as described in

Re: documentCache not used in 4.3.1?

2013-06-29 Thread Erick Erickson
It's especially weird that the hit ratio is so high and you're not seeing anything in the cache. Are you perhaps soft committing frequently? Soft commits throw away all the top-level caches including documentCache I think Erick On Fri, Jun 28, 2013 at 7:23 PM, Tim Vaillancourt

Re: Improving performance to return 2000+ documents

2013-06-29 Thread Erick Erickson
Well, depending on how many docs get served from the cache the time will vary. But this is just ugly, if you can avoid this use-case it would be a Good Thing. Problem here is that each and every shard must assemble the list of 2,000 documents (just ID and sort criteria, usually score). Then the
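A toy sketch of the cost described above, assuming a coordinator that merges per-shard lists of (id, score) pairs. The function names are hypothetical and this is not Solr's implementation; it only shows that every shard has to produce its own full top-N list before the final N can be chosen.

```python
import heapq


def shard_top_n(shard_docs, n):
    """Each shard ranks its own documents and returns its best n (id, score) pairs."""
    return heapq.nlargest(n, shard_docs, key=lambda doc: doc[1])


def merge_top_n(per_shard_lists, n):
    """The coordinator merges every shard's list and keeps the overall best n."""
    merged = [doc for shard_list in per_shard_lists for doc in shard_list]
    return heapq.nlargest(n, merged, key=lambda doc: doc[1])


# Two tiny shards; with n = 2000 each shard would ship up to 2,000 entries.
shards = [
    [("a", 0.9), ("b", 0.1)],
    [("c", 0.5), ("d", 0.7)],
]
top = merge_top_n([shard_top_n(s, 2) for s in shards], 2)
print(top)
```

With S shards the coordinator handles up to S times N entries to return N documents, which is why large result windows get expensive in a sharded setup.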

Re: FileDataSource vs JdbcDataSouce (speed) Solr 3.5

2013-06-29 Thread Erick Erickson
Mike: One issue is that you're forcing all the work onto the Solr server, and single-threading to boot by using DIH. You can consider moving to a SolrJ model where you can have N clients sending data to Solr if you can partition the data up amongst the N clients cleanly. FWIW, Erick On Sat,

Re: cores sharing an instance

2013-06-29 Thread Erick Erickson
Well, the code is all in the same JVM, so there's no reason a singleton approach wouldn't work that I can think of. All the multithreaded caveats apply. Best Erick On Fri, Jun 28, 2013 at 3:44 PM, Peyman Faratin pey...@robustlinks.com wrote: Hi I have a multicore setup (in 4.3.0). Is it

Re: broken links returned from solr search

2013-06-29 Thread gilawem
Sorry, I thought it was obvious. The links that are broken are the links that are returned in the search results. Using the example in the documentation I mentioned below, to load a word doc via curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" -F

Re: Solr 4.3.0 DIH problem with MySQL datetime being imported with time as 00:00:00

2013-06-29 Thread Bill Au
So disabling convertType does provide a workaround for my problem with datetime column. But the problem still exists when convertType is enabled because DIH is not doing the conversion correctly for a solr date field. Solr date field does have a time portion but java.sql.Date does not. So DIH
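The mismatch described above can be illustrated by analogy in Python, where a plain date also has no time portion. This is only an analogy and not DIH code: the helper to_solr_date is hypothetical, and the naive values are assumed to already be in UTC.

```python
from datetime import datetime, time


def to_solr_date(value):
    """Format a date or datetime as a Solr-style timestamp string (assumed UTC)."""
    if isinstance(value, datetime):
        # A datetime carries hours, minutes and seconds, so they are preserved.
        return value.strftime("%Y-%m-%dT%H:%M:%SZ")
    # A plain date has no time portion, so midnight is the only value available.
    return datetime.combine(value, time.min).strftime("%Y-%m-%dT%H:%M:%SZ")


stored = datetime(2013, 6, 29, 14, 33, 5)
full = to_solr_date(stored)              # time portion kept
truncated = to_solr_date(stored.date())  # time portion lost, becomes 00:00:00
print(full)
print(truncated)
```

Reading the column through a date-only type behaves like the second call, which matches the 00:00:00 symptom in the thread subject.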

Re: Solr 4.3.0 DIH problem with MySQL datetime being imported with time as 00:00:00

2013-06-29 Thread Shalin Shekhar Mangar
Yes we need to use getTimestamp instead of getDate. Please create an issue. On Sat, Jun 29, 2013 at 11:48 PM, Bill Au bill.w...@gmail.com wrote: So disabling convertType does provide a workaround for my problem with datetime column. But the problem still exists when convertType is enabled

RE: documentCache not used in 4.3.1?

2013-06-29 Thread Vaillancourt, Tim
Yes, we are softCommit'ing every 1000ms, but that should be enough time to see metrics though, right? For example, I still get non-cumulative metrics from the other caches (which are also thrown away). I've also curl/sampled enough that I probably should have seen a value by now. If anyone else

Re: broken links returned from solr search

2013-06-29 Thread Erick Erickson
There's nothing built into the indexing process that stores URLs allowing you to fetch the document; you have to do that yourself. I'm not sure how the link is getting into the search results, you're assigning doc1 as the ID of the doc, and I think the browse request handler, aka Solritas is

Re: documentCache not used in 4.3.1?

2013-06-29 Thread Erick Erickson
Tim: Yeah, this doesn't make much sense to me either since, as you say, you should be seeing some metrics upon occasion. But do note that the underlying cache only gets filled when getting documents to return in query results, since there's no autowarming going on it may come and go. But you can

Re: Solr 4.3.0 DIH problem with MySQL datetime being imported with time as 00:00:00

2013-06-29 Thread Bill Au
https://issues.apache.org/jira/browse/SOLR-4978 On Sat, Jun 29, 2013 at 2:33 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Yes we need to use getTimestamp instead of getDate. Please create an issue. On Sat, Jun 29, 2013 at 11:48 PM, Bill Au bill.w...@gmail.com wrote: So

Re: Improving performance to return 2000+ documents

2013-06-29 Thread Peter Sturge
Hello Utkarsh, This may or may not be relevant for your use-case, but the way we deal with this scenario is to retrieve the top N documents 5, 10, 20 or 100 at a time (user selectable). We can then page the results, changing the start parameter to return the next set. This allows us to 'retrieve'
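The paging approach above can be sketched as follows. The helper page_params is hypothetical; start and rows are the standard Solr paging parameters, and the page sizes come from the message.

```python
def page_params(query, page, rows=20):
    """Return Solr parameters for a zero-based page of `rows` results."""
    if page < 0 or rows <= 0:
        raise ValueError("page must be >= 0 and rows must be > 0")
    return {
        "q": query,
        "start": page * rows,  # offset of the first document on this page
        "rows": rows,          # page size chosen by the user (5, 10, 20 or 100)
    }


first_page = page_params("boot", 0, 10)
third_page = page_params("boot", 2, 10)
print(first_page, third_page)
```

Each request then asks Solr for only one page of documents rather than the full result set.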

Re: Varnish

2013-06-29 Thread William Bell
OK. Here is the answer for us. Here is a sample default.vcl. We are validating the LastModified ( if (!beresp.http.last-modified) ) is changing when the core is indexed and the version changes of the index. This does 10 minutes caching and a 1hr grace period (if solr is down, it will deliver

Re: Varnish

2013-06-29 Thread William Bell
On a large website, by putting 1 varnish in front of all 4 SOLR boxes we were able to trim 25% off the load time (TTFB) of the page. Our hit ratio was between 55 and 75%. We gave varnish 24GB of RAM, and were not able to fill it under full load with a 10 minute cache timeout. We get about 2.4M

Re: Http status 503 Error in solr cloud setup

2013-06-29 Thread Lance Norskog
I do not know what causes the error. This setup will not work. You need one or three zookeepers. SolrCloud demands that a majority of the ZK servers agree. If you have two ZKs this will not work. On 06/29/2013 05:47 AM, Sagar Chaturvedi wrote: Hi, I setup 2 solr instances on 2 different
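The majority requirement above comes down to simple arithmetic, sketched here with hypothetical helper names. This is general quorum math, not ZooKeeper code.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this arithmetic a two-server ensemble needs both servers up to form a majority, so it tolerates no more failures than a single server, while three servers tolerate one failure.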

Re: Varnish

2013-06-29 Thread Lance Norskog
Solr HTTP caching also supports e-tags. These are unique keys for the output of a query. If you send a query twice, and the index has not changed, the return will be the same. The e-tag is generated from the query string and the index generation number. If Varnish supports e-tags, you can keep
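A minimal sketch of the idea above, assuming an e-tag is derived by hashing the query string together with the index generation number. The function make_etag and the use of SHA-1 are assumptions for illustration, not Solr's actual algorithm.

```python
import hashlib


def make_etag(query_string, index_generation):
    """Derive a cache validator from the query and the index generation (illustrative only)."""
    payload = f"{index_generation}:{query_string}".encode("utf-8")
    return '"' + hashlib.sha1(payload).hexdigest() + '"'


same_a = make_etag("q=boot&rows=10", 42)
same_b = make_etag("q=boot&rows=10", 42)
after_commit = make_etag("q=boot&rows=10", 43)  # index changed, so the tag changes
print(same_a == same_b, same_a == after_commit)
```

A cache in front of Solr could keep serving a stored response for as long as the tag it holds still matches the one the server reports.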

Re: documentCache not used in 4.3.1?

2013-06-29 Thread Tim Vaillancourt
That's a good idea, I'll try that next week. Thanks! Tim On 29/06/13 12:39 PM, Erick Erickson wrote: Tim: Yeah, this doesn't make much sense to me either since, as you say, you should be seeing some metrics upon occasion. But do note that the underlying cache only gets filled when getting

Re: broken links returned from solr search

2013-06-29 Thread gilawem
OK thanks. So I guess I will set up my own normal webserver and have the solr server act as a sort of private web-based API (or possibly a front-end that, when a user clicks on a search result link, just redirects the user to my normal web server that has the related file). That's easy enough. If