multivalued field using DIH

2014-03-27 Thread scallawa
I am using Solr 4.7 and am importing data directly from a MySQL database table using the DIH. I have a column that looks similar to the one below, in that it has multiple values in the database. material cotton polyester blend rayon I would like the data to look like the following

Re: Solr 4.3.1 memory swapping

2014-03-27 Thread Shawn Heisey
On 3/26/2014 10:26 PM, Darrell Burgan wrote: Okay well it didn't take long for the swapping to start happening on one of our nodes. Here is a screen shot of the Solr console: https://s3-us-west-2.amazonaws.com/panswers-darrell/solr.png And here is a shot of top, with processes sorted by

Re: multivalued field using DIH

2014-03-27 Thread Shawn Heisey
On 3/27/2014 12:49 AM, scallawa wrote: I am using Solr 4.7 and am importing data directly from a MySQL database table using the DIH. I have a column that looks similar to the one below, in that it has multiple values in the database. material cotton polyester blend rayon I

Re: FE Integration with JSON

2014-03-27 Thread Shawn Heisey
On 3/27/2014 2:11 AM, Bernhard Prange wrote: I am looking for a simple solution to construct a frontend search. The search provider just gave me a JSON Url. Anybody has a simple guide or some snippets for that? There are no details here. What specifically do you need help with? Presumably

AW: Indexing parts of an HTML file differently

2014-03-27 Thread Michael Clivot
Thanks for your answer Jack. @Gora: How are you fetching the HTML content, and indexing it into Solr? We are using Solr with the OpenText Delivery Server. The Delivery Server generates HTML representations of the published pages and writes them to the directory, which is used by Solr to get

Re: FE Integration with JSON

2014-03-27 Thread Bernhard Prange
Right :) Thanks Shawn. It is the frontend of a webpage (HTML5). The search provider offers me a URL where I get a Solr query result (in JSON). That's what I have. What I need is a how-to for the UI rendering of this file (and the search query functionality). The Solr server is on a

Re: FE Integration with JSON

2014-03-27 Thread Alexandre Rafalovitch
Still not enough details. But let me try to understand: There is a third party provider. They are exposing Solr directly to the internet and you have a particular query that returns Solr results in JSON form. You want to know if there are libraries/components that will know how to parse that

Re: Indexing parts of an HTML file differently

2014-03-27 Thread Alexandre Rafalovitch
Can you get Delivery Server to generate Solr-style XML or JSON update file? Might be easier than generating and then re-parsing HTML? Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency On Thu, Mar

dih data-config.xml onImportEnd event

2014-03-27 Thread Andreas Owen
I would like to call a URL after the import is finished with the event document onImportEnd=. How can I do this?

Re: Facetting by field then query

2014-03-27 Thread Alvaro Cabrerizo
I don't think you can do it, as pivot faceting (http://wiki.apache.org/solr/SimpleFacetParameters#Pivot_.28ie_Decision_Tree.29_Faceting) doesn't let you use facet queries. The closest query I can imagine is: - q=sentence:bar OR sentence:foo - facet=true - facet.pivot=media_id,sentence At
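
The three parameters listed in the message above can be assembled into a request URL. This is a minimal sketch, not something from the thread: the host and core name in SOLR_SELECT are placeholders, the function name is hypothetical, and wt=json is added only so the response is easy to read from a script. The field names sentence and media_id come from the message.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the thread does not name a host or a core.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def pivot_facet_url(terms, pivot_fields, base=SOLR_SELECT):
    """Build a select URL that ORs the terms on `sentence` and pivots on the given fields."""
    params = [
        ("q", " OR ".join(f"sentence:{term}" for term in terms)),
        ("facet", "true"),
        ("facet.pivot", ",".join(pivot_fields)),
        ("wt", "json"),  # assumption: JSON output, not part of the original suggestion
    ]
    # urlencode percent-escapes ':' and ',' and turns spaces into '+'.
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(pivot_facet_url(["bar", "foo"], ["media_id", "sentence"]))
```

Keeping the parameters as a list of pairs preserves their order and would allow a repeated key such as a second facet.pivot if one were needed.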

Re: dih data-config.xml onImportEnd event

2014-03-27 Thread Alexandre Rafalovitch
I don't think there is one like that. But you might be able to use a custom UpdateRequestProcessor? Or a postCommit hook in solrconfig.xml Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency On Thu,

Re: dih data-config.xml onImportEnd event

2014-03-27 Thread Ahmet Arslan
Hi Andreas, Here is a snippet you can use as a starting point. import org.apache.solr.handler.dataimport.Context; import org.apache.solr.handler.dataimport.EventListener; public class MyEventListener implements EventListener {   public void onEvent(Context ctx) {     if

Re: dih data-config.xml onImportEnd event

2014-03-27 Thread Alexandre Rafalovitch
Oops. Ignore my email. I learnt something today that I have not seen anybody else use. Are there live open-source examples of the DIH EventListeners? Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

Re: MergingSolrIndexes not supported by SolrCloud?why?

2014-03-27 Thread rulinma
I tested with HDFS and it does not work. I tried: (1) **/indexDir=hdfs://ip/solr/sample/data/index (2) **/indexDir=/solr/sample/data/index; neither works. I also tried: (3) **/srcCore=sample, which does not work either. Can anyone give me a working example? Thanks! When I insert data, index files appear in HDFS,

facet doesnt display all possibilities after selecting one

2014-03-27 Thread Andreas Owen
When I select a facet in thema_f, all the others in the group disappear, but the other facets keep the original findings. It seems like it should work. Maybe the underscore is the wrong character for the separator? Example documents in index: <doc> <arr name="thema_f"> <str>1_Produkte</str>

dih data-config.xml onImportEnd event

2014-03-27 Thread Andreas Owen
I would like to call a URL after the import is finished with the event document onImportEnd=. How can I do this?

Re: Facetting by field then query

2014-03-27 Thread David Santamauro
For pivot facets in SolrCloud, see https://issues.apache.org/jira/browse/SOLR-2894 Resolution: Unresolved Fix Version/s 4.8 I am waiting patiently ... On 03/27/2014 05:04 AM, Alvaro Cabrerizo wrote: I don't think you can do it, as pivot

Re: dih data-config.xml onImportEnd event

2014-03-27 Thread Stefan Matheis
I would suggest you read the replies to your last mail (containing the very same question) first? -Stefan On Thursday, March 27, 2014 at 1:56 PM, Andreas Owen wrote: I would like to call a URL after the import is finished with the event document onImportEnd=. How can I do this?

Re: facet doesnt display all possibilities after selecting one

2014-03-27 Thread Yonik Seeley
On Thu, Mar 27, 2014 at 8:56 AM, Andreas Owen ao...@swissonline.ch wrote: When I select a facet in thema_f, all the others in the group disappear OK, I see you're excluding filters tagged with thema_f when faceting on the thema_f field. str

Block until replication finishes

2014-03-27 Thread Fermin Silva
Hi, we are moving to native replication with SOLR 3.5.1. Because we want to control the replication from another program (a cron job), we decided to curl the slave to issue a fetchIndex command. The problem we have is that the curl returns immediately, while the replication still goes in the

Please remove this thread.

2014-03-27 Thread Baruch
Hello Admin,  Can you please remove this thread http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/93279 There is no reason to have this thread live. Please and thank you. Baruch!

Logging which client connected to Solr

2014-03-27 Thread Juha Haaga
Hello, I’m investigating the possibility of logging the username of the client who did the search on Solr along with the normal logging information. The username is in the basic auth headers of the request, and the access control is managed by an Apache instance proxying to Solr. Is there a

Re: Logging which client connected to Solr

2014-03-27 Thread Greg Walters
We do something similar and include the server's hostname in solr's response. To accomplish this you'll have to write a class that extends org.apache.solr.servlet.SolrDispatchFilter and put your custom class in place as the SolrRequestFilter in solr's web.xml. Thanks, Greg On Mar 27, 2014, at

[ANN] Solr in Action book release (Solr 4.7)

2014-03-27 Thread Trey Grainger
I'm excited to announce the final print release of *Solr in Action*, the newest Solr book by Manning publications covering through Solr 4.7 (the current version). The book is available for immediate purchase in print and ebook formats, and the *outline*, some *free chapters* as well as the *full

Re: Logging which client connected to Solr

2014-03-27 Thread Jeff Wartes
You could always just pass the username as part of the GET params for the query. Solr will faithfully ignore and log any parameters it doesn't recognize, so it'd show up in your {lot of params}. That means your log parser would need more intelligence, and your client would have to pass in the

Timeout when deleting collections or aliases in Solr 4.6.1

2014-03-27 Thread Dave Seltzer
I'm trying to delete some data on a 12 node Solr cloud environment. The cluster is running Solr 4.6.1. When I try to delete an alias the collections api returns: org.apache.solr.common.SolrException: deletealias the collection time out:60s at

Re: [ANN] Solr in Action book release (Solr 4.7)

2014-03-27 Thread Mark Miller
Nice, Congrats! --  Mark Miller about.me/markrmiller On March 27, 2014 at 11:17:49 AM, Trey Grainger (solrt...@gmail.com) wrote: I'm excited to announce the final print release of *Solr in Action*, the newest Solr book by Manning publications covering through Solr 4.7 (the current version).

Re: Please remove this thread.

2014-03-27 Thread Shawn Heisey
On 3/27/2014 7:37 AM, Baruch wrote: Can you please remove this thread http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/93279 There is no reason to have this thread live. This is an Apache mailing list. Apache almost never honors requests to remove anything from its mailing

WordDelimiterFilterFactory splits up hyphenated terms although splitOnNumerics, generateWordParts and generateNumberParts are set to 0 (false)

2014-03-27 Thread Malte Hübner
I am using Solr 4.7 and have a serious problem with WordDelimiterFilterFactory. WordDelimiterFilterFactory behaves differently on hyphenated terms depending on whether they contain only characters (a-Z) or characters AND numbers. Splitting up hyphenated terms is deactivated in my configuration: *This is the

Re: Logging which client connected to Solr

2014-03-27 Thread Alexandre Rafalovitch
I assume you are passing extra info to Solr. Then you can write a servlet filter to put it in the NDC or MDC, which can then be picked up by the log4j config pattern. This approach is not Solr-specific. Just the usual servlet/log stuff. Regards, Alex On 27/03/2014 9:00 pm, Juha Haaga

timeAllowed query parameter not working?

2014-03-27 Thread Mario-Leander Reimer
Hi Solr users, currently I have some really long-running, user-entered pure wildcard queries (like *??); these are hogging the CPU for several minutes. So what I tried is setting the timeAllowed query parameter via the search handler in solrconfig.xml. But without any luck, the parameter

Re: Block until replication finishes

2014-03-27 Thread Chris W
Hi, You can use the details command to check the status of replication. http://localhost:8983/solr/core_name/replication?command=details The command returns XML output; look for the isReplicating field in it. Keep running the command in a loop until the flag becomes false.
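
The polling loop described in this reply could look roughly like the following. It is a sketch under assumptions rather than a tested recipe: it assumes the details response carries the flag as a str element whose name attribute is isReplicating, the URL is the placeholder from the message, and all function names are hypothetical. A missing flag is treated as "not replicating", and a poll issued immediately after fetchIndex may run before the slave has set the flag, so a short initial delay may be needed.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

# Placeholder URL from the message; substitute the real host and core name.
DETAILS_URL = "http://localhost:8983/solr/core_name/replication?command=details"


def is_replicating(details_xml):
    """Return True if any isReplicating element in the details XML reads 'true'."""
    root = ET.fromstring(details_xml)
    flags = [el.text for el in root.iter("str") if el.get("name") == "isReplicating"]
    # No flag at all counts as "not replicating" (an assumption of this sketch).
    return any((flag or "").strip().lower() == "true" for flag in flags)


def fetch_details(url=DETAILS_URL):
    """Fetch the raw XML returned by the details command."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def wait_for_replication(fetch=fetch_details, poll_seconds=5.0, max_polls=120):
    """Poll until isReplicating is false. Returns False if max_polls is exhausted."""
    for _ in range(max_polls):
        if not is_replicating(fetch()):
            return True
        time.sleep(poll_seconds)
    return False
```

A cron job could call wait_for_replication() right after issuing the fetchIndex command and treat a False return as a timeout. The fetch parameter is injectable so the loop can be exercised without a running Solr instance.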

Re: New to Solr can someone help me to know if Solr fits my use case

2014-03-27 Thread Saurabh Agarwal
Can anyone help me please? Hi All, I am new to Solr and from initial reading I am quite convinced Solr will be of great help. Can anyone help in making that decision? Use case: 1. I will have PDF and Word docs generated daily/weekly (a lot of them), which get overwritten frequently. 2. I

Re: [ANN] Solr in Action book release (Solr 4.7)

2014-03-27 Thread Trey Grainger
Hi Philippe, Yes if you've purchased the eBook then the PDF is available now and the other formats (ePub and Kindle) are supposed to be available for download on April 8th. It's also worth mentioning that the eBook formats are all available for free with the purchase of the print book. Best

Re: Searching multivalue fields.

2014-03-27 Thread Jack Krupansky
Sounds good... for Lucene users, but for Solr users... sounds like a Jira is needed. -- Jack Krupansky -Original Message- From: Ahmet Arslan Sent: Wednesday, March 26, 2014 4:54 PM To: solr-user@lucene.apache.org ; kokatnur.vi...@gmail.com Subject: Re: Searching multivalue fields.

What are my options?

2014-03-27 Thread Software Dev
We have a collection named items. These are simply products that we sell. A large part of our scoring involves boosting on certain metrics for each product (amount sold, total GMS, ratings, etc). Some of these metrics are actually split across multiple tables. We are currently re-indexing the

Re: What are my options?

2014-03-27 Thread Jack Krupansky
Consider DataStax Enterprise - a true real-time database with rich search (Cassandra plus Solr). -- Jack Krupansky -Original Message- From: Software Dev Sent: Thursday, March 27, 2014 1:11 PM To: solr-user@lucene.apache.org Subject: What are my options? We have a collection named

Re: stored=true vs stored=false, in terms of storage

2014-03-27 Thread Jack Krupansky
You can consider DocValues as well. There you can control whether they ever use heap memory or only file space. See: https://cwiki.apache.org/confluence/display/solr/DocValues -- Jack Krupansky -Original Message- From: Pramod Negi Sent: Wednesday, March 26, 2014 1:27 PM To:

RE: Solr 4.3.1 memory swapping

2014-03-27 Thread Darrell Burgan
Thanks for the advice Shawn - gives me a direction to head. My next step is probably to update the operating system and the JVM to see if the behavior changes. If not, I'll pull in Red Hat support. Thanks, Darrell -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent:

Re: dih data-config.xml onImportEnd event

2014-03-27 Thread Andreas Owen
sorry, the previous conversation was started with a false email-address. On Thu, 27 Mar 2014 14:06:57 +0100, Stefan Matheis matheis.ste...@gmail.com wrote: I would suggest you read the replies to your last mail (containing the very same question) first? -Stefan On Thursday, March 27,

RE: timeAllowed query parameter not working?

2014-03-27 Thread Michael Ryan
Unfortunately, the timeAllowed parameter doesn't apply to the part of the processing that makes wildcard queries so slow. It only applies to a later part of the processing, when the matching documents are being collected. There's some discussion in the original ticket that implemented this
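
Because timeAllowed only bounds the collection phase, the most a client can do is detect the cases where the limit did cut a search short. The sketch below is illustrative only: it assumes the JSON response writer and that Solr reports a truncated search through a partialResults flag in the responseHeader. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode


def select_params(query, time_allowed_ms):
    """Encode a query string that asks Solr to stop collecting after the given milliseconds."""
    return urlencode({"q": query, "timeAllowed": str(time_allowed_ms), "wt": "json"})


def was_cut_short(response_json):
    """Return True if the response header says the results are partial."""
    header = json.loads(response_json).get("responseHeader", {})
    return bool(header.get("partialResults", False))
```

This does not shorten the wildcard expansion itself; it only tells the caller whether the results that did come back are complete.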

Stats Filter Exclusion Throwing Error

2014-03-27 Thread Harish Agarwal
I'm using the latest nightly build of 4.8 and testing this patch: https://issues.apache.org/jira/browse/SOLR-3177 using this set of fq / stats.field query params: fq={!tag=INTEGER_4}INTEGER_4:(2)&stats.field={!ex=INTEGER_4}INTEGER_4 with Solr throwing the following error: ERROR - 2014-03-27

SOLR Cloud 4.6 - PERFORMANCE WARNING: Overlapping onDeckSearchers=2

2014-03-27 Thread Rishi Easwaran
All, I am running SOLR Cloud 4.6, everything looks ok, except for this warn message constantly in the logs. 2014-03-27 17:09:03,982 WARN [commitScheduler-15-thread-1] [] SolrCore - [index_shard16_replica1] PERFORMANCE WARNING: Overlapping onDeckSearchers=2 2014-03-27 17:09:05,517 WARN

Re: DIH dataimport.properties Zulu time

2014-03-27 Thread Kiran J
Thank you for the response. This works if I invoke start.jar with java. In my use case, however, I need to invoke start.jar directly (as a console-less service, so that the user cannot close it accidentally). It doesn't pick up the user.timezone property when done this way. Is it possible to do this using the

Re: [ANN] Solr in Action book release (Solr 4.7)

2014-03-27 Thread Jagat Singh
Many congrats; 600+ pages makes me appreciate the two years of tireless hard work behind it. On Fri, Mar 28, 2014 at 4:04 AM, Trey Grainger solrt...@gmail.com wrote: Hi Philippe, Yes if you've purchased the eBook then the PDF is available now and the other formats (ePub and Kindle) are supposed

Re: Multiple Languages in Same Core

2014-03-27 Thread Trey Grainger
In addition to the two approaches Liu Bo mentioned (separate core per language and separate field per language), it is also possible to put multiple languages in a single field. This saves you the overhead of multiple cores and of having to search across multiple fields at query time. The idea

Re: DIH dataimport.properties Zulu time

2014-03-27 Thread Kiran J
I figured it out. I use SQL Server, so this is my solution : propertyWriter dateFormat=*-MM-dd'T'HH:mm:ssXXX* type=SimplePropertiesWriter / In TSQL, this can be converted to a UTC date time using : CONVERT(datetimeoffset, '${dih.last_index_time}', 127) Refs:

Re: String Cast Error

2014-03-27 Thread Chris Hostetter
: I have a search that sorts on a boolean field. This search is pulling : the following error: java.lang.String cannot be cast to : org.apache.lucene.util.BytesRef. This is almost certainly another manifestation of SOLR-5920... https://issues.apache.org/jira/browse/SOLR-5920 -Hoss

Re: New to Solr can someone help me to know if Solr fits my use case

2014-03-27 Thread Alexandre Rafalovitch
This feels somewhat backwards. It's very hard to extract line-number information out of MS Word and next to impossible from PDF. So the question is not whether Solr is a good fit here; it may be that your whole architecture has a major issue. Can you do this/what you want by hand at least once? Down

Re: document level security filter solution for Solr

2014-03-27 Thread Philip Durbin
Yonik, your reply was incredibly helpful. Thank you very much! The join approach to document security you explained is somewhat similar to what I called Option 2 (ACL PostFilter) since permissions are stored in each document, but it's much simpler in that I'm not required to write, compile, and

Re: New to Solr can someone help me to know if Solr fits my use case

2014-03-27 Thread Saurabh Agarwal
Thanks a lot Alex for your reply, much appreciated. So if I leave out the line number part: 1. I guess putting PDF/Word in Solr for search can be done. These documents will go in Solr. 2. For search, is there any automatic way to supply an Excel sheet or a large set of search keywords to search for? I.e. I have 1000's

Re: New to Solr can someone help me to know if Solr fits my use case

2014-03-27 Thread Alexandre Rafalovitch
1. You don't actually put PDF/Word into Solr. Instead, it is run through a content and metadata extraction process, and the result is what gets indexed. This is important because a computer does not understand what you are looking for when you open a PDF. It only understands whatever text is possible to extract. In

[RE-BALANCE of Collection] Re-balancing of collection after adding nodes to clustered node

2014-03-27 Thread Debasis Jana
Hi, I found the email addresses from a slide-share @ http://www.slideshare.net/thelabdude/tjp-solr-webinar. It's very useful. We are developing SOLR search using CDH4 Cloudera and embedded SOLR 4.4.0-search-1.1.0. We created a Collection when the cluster had 2 slave nodes. Then two extra nodes

Re: Question on highlighting edgegrams

2014-03-27 Thread Software Dev
Certainly I am not the only user experiencing this? On Wed, Mar 26, 2014 at 1:11 PM, Software Dev static.void@gmail.com wrote: Is this a known bug? On Tue, Mar 25, 2014 at 1:12 PM, Software Dev static.void@gmail.com wrote: Same problem here:

Re: Question on highlighting edgegrams

2014-03-27 Thread Shalin Shekhar Mangar
Yes, there are known bugs with EdgeNGram filters. I think they are fixed in 4.4. See https://issues.apache.org/jira/browse/LUCENE-3907 On Fri, Mar 28, 2014 at 10:17 AM, Software Dev static.void@gmail.com wrote: Certainly I am not the only user experiencing this? On Wed, Mar 26, 2014 at

Product index schema for solr

2014-03-27 Thread Ajay Patel
Original Message Subject: Product index schema for solr Date: Fri, 28 Mar 2014 10:46:20 +0530 From: Ajay Patel apa...@officebeacon.com To: solr-user-ow...@lucene.apache.org Hi Solr users and developers. I am new to the world of the Solr search engine. I have a