JSON Facet Analytics API in Solr 5.1

2015-04-14 Thread Yonik Seeley
Folks, there's a new JSON Facet API in the just released Solr 5.1 (actually, a new facet module under the covers too). It's marked as experimental so we have time to change the API based on your feedback. So let us know what you like, what you would change, what's missing, or any other ideas you

Re: Facet sorting algorithm for index

2015-04-02 Thread Yonik Seeley
On Thu, Apr 2, 2015 at 6:36 AM, yriveiro yago.rive...@gmail.com wrote: Hi, I have an external application that uses the output of a facet to join another dataset using the keys of the facet result. The facet query uses index sort but at some point, my application crashes because the order of the

Re: Facet sorting algorithm for index

2015-04-02 Thread Yonik Seeley
On Thu, Apr 2, 2015 at 9:44 AM, Yago Riveiro yago.rive...@gmail.com wrote: Where can I find the source code used in index sorting? I need to ensure that the external data has the same sorting as the facet result. If you step over the indexed terms of a field you get them in sorted order

Re: sort on facet.index?

2015-04-02 Thread Yonik Seeley
On Thu, Apr 2, 2015 at 10:25 AM, Ryan Josal rjo...@gmail.com wrote: Sorting the result set or the facets? For the facets there is facet.sort=index (lexicographically) and facet.sort=count. So maybe you are asking if you can sort by index, but reversed? I don't think this is possible, and

Re: How to create a core by API?

2015-03-26 Thread Yonik Seeley
On Thu, Mar 26, 2015 at 1:45 PM, Mark E. Haase meha...@gmail.com wrote: I'm not saying you're wrong. The configSet parameter doesn't work at all in my set up, so you might be right... I'm just wondering where that's documented. Trying on current trunk, I got it to work:

Re: schemaless slow indexing

2015-03-23 Thread Yonik Seeley
On Mon, Mar 23, 2015 at 1:54 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: I looked at SOLR-7290, but I think the discussion should stay on the mailing list for at least one more iteration. My understanding that the reason copyField exists is so that a search actually worked out of the

Re: schemaless slow indexing

2015-03-22 Thread Yonik Seeley
I took a quick look at the stock schemaless configs... unfortunately they contain a performance trap. There's a copyField by default that copies *all* fields to a catch-all field called _text. IMO, that's not a great default. Double the index size (well, the index portion of it at least... not

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Yonik Seeley
The document cache is not really going to be taking up time here. How many concurrent requests (threads) are you testing with here? One thing I've seen over the years is a false sense of what is taking up time when benchmarks with a lot of threads are used. The reason is that when there are a

Re: Facet pivot sorting while combining Stats Component With Pivots in Solr 5

2015-03-19 Thread Yonik Seeley
On Fri, Mar 13, 2015 at 1:43 PM, Dominique Bejean dominique.bej...@eolya.fr wrote: Thank you for the response This is something Heliosearch can do. Ionic Seeley, created a JIRA ticket to back port this feature to Solr 5. Oh, I'm charged now, am I? ;-) It's been committed, and will be in

Re: Solr tlog and soft commit

2015-03-15 Thread Yonik Seeley
Your basic assumptions about the underlying mechanisms are incorrect. The size of the index has nothing to do with the transaction logs... and transaction logs are never written to index except in recovery. You would see the same index size behavior w/o transaction logs, and it has to do with some

Re: Solr tlog and soft commit

2015-03-15 Thread Yonik Seeley
On Sun, Mar 15, 2015 at 12:09 PM, Erick Erickson erickerick...@gmail.com wrote: 1 Well, probably not. Hate to be confusing here, but if your ramBufferSizeMB setting is exceeded, then internal buffers will be flushed to the currently open segment in the index directory. It's even more

Re: backport Heliosearch features to Solr

2015-03-09 Thread Yonik Seeley
at the bottom. https://docs.google.com/spreadsheets/d/1uZ2qgOaKx1ZxJ_NKwj2zIAYFQ9fp8OrEPI5hqadcPeY/ -Yonik On Sun, Mar 1, 2015 at 4:50 PM, Yonik Seeley ysee...@gmail.com wrote: As many of you know, I've been doing some work in the experimental heliosearch fork of Solr over the past year. I think it's

backport Heliosearch features to Solr

2015-03-01 Thread Yonik Seeley
As many of you know, I've been doing some work in the experimental heliosearch fork of Solr over the past year. I think it's time to bring some more of those changes back. So here's a poll: Which Heliosearch features do you think should be brought back to Apache Solr? http://bit.ly/1E7wi1Q

Re: backport Heliosearch features to Solr

2015-03-01 Thread Yonik Seeley
://sematext.com/ On Sun, Mar 1, 2015 at 4:50 PM, Yonik Seeley ysee...@gmail.com wrote: As many of you know, I've been doing some work in the experimental heliosearch fork of Solr over the past year. I think it's time to bring some more of those changes back. So here's a poll: Which

Re: AND query not working on stopwords as expected

2015-02-16 Thread Yonik Seeley
On Mon, Feb 16, 2015 at 4:32 PM, Arun Rangarajan arunrangara...@gmail.com wrote: [...] This query q=name:of&rows=0 gives no results as expected. However, this query: q=name:of AND all_class_ids:(371)&rows=0 gives results and is equal to the same number of results as

Re: Query always fail if row value is too high

2015-02-09 Thread Yonik Seeley
Hmmm, that's interesting... It looks like a container (jetty/tomcat or whatever) configuration limit somewhere. I'd only expect this error from Solr when trying to send something really large though - notice upload in the error. Is this error message really from Solr or another piece of your

Re: Deep paging in solr using cursorMark

2015-01-27 Thread Yonik Seeley
On Tue, Jan 27, 2015 at 3:29 AM, CKReddy Bhimavarapu chaitu...@gmail.com wrote: Hi, Using CursorMark we over come the Deep paging so far so good. As far as I understand cursormark unique for each and every query depending on sort values other than unique id and also depends up on number

Re: leader split-brain at least once a day - need help

2015-01-08 Thread Yonik Seeley
It's worth noting that those messages alone don't necessarily signify a problem with the system (and it wouldn't be called split brain). The async nature of updates (and thread scheduling) along with stop-the-world GC pauses that can change leadership, cause these little windows of inconsistencies

Re: Loading data to FieldValueCache

2014-12-29 Thread Yonik Seeley
On Fri, Dec 26, 2014 at 12:26 PM, Erick Erickson erickerick...@gmail.com wrote: I don't know the complete algorithm, but if the number of docs that satisfy the fq is small enough, then just the internal Lucene doc IDs are stored rather than a bitset. If smaller than maxDoc/64 ids are

Re: 'Illegal character in query' on Solr cloud 4.10.1

2014-12-24 Thread Yonik Seeley
On Wed, Dec 24, 2014 at 4:32 PM, Erick Erickson erickerick...@gmail.com wrote: OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are configured in such a way that they consider ^ to be an illegal character. Hmmm, the stack trace in SOLR-5971 shows a different user (who gets

Re: Details on why ConccurentUpdateSolrServer is reccommended for maximum index performance

2014-12-11 Thread Yonik Seeley
On Thu, Dec 11, 2014 at 11:52 AM, Alexandre Rafalovitch arafa...@gmail.com wrote: On 11 December 2014 at 11:40, Yonik Seeley yo...@heliosearch.com wrote: So to Solr (server side), it looks like a single update request (assuming 1 thread) with a batch of multiple documents... but it was never

Re: How to stop Solr tokenising search terms with spaces

2014-12-09 Thread Yonik Seeley
[mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: 08 December 2014 17:58 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces On Mon, Dec 8, 2014 at 12:01 PM, Erik Hatcher erik.hatc...@gmail.com wrote: debug output tells a lot. Looks like

Re: How to stop Solr tokenising search terms with spaces

2014-12-08 Thread Yonik Seeley
...@gmail.com] On Behalf Of Yonik Seeley Sent: 07 December 2014 20:49 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu dinesh.b...@pb.com wrote: Thanks Yonik. This does not seem to work for me. This is what I

Re: How to stop Solr tokenising search terms with spaces

2014-12-08 Thread Yonik Seeley
On Mon, Dec 8, 2014 at 12:01 PM, Erik Hatcher erik.hatc...@gmail.com wrote: debug output tells a lot. Looks like in the last two examples that the second part (Viewpoint*) is NOT parsed with the complex phrase parser - the whitespace thwarts it. Actually, it looks like it is, but you're not

Re: How to stop Solr tokenising search terms with spaces

2014-12-07 Thread Yonik Seeley
On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu dinesh.b...@pb.com wrote: Thanks Yonik. This does not seem to work for me. This is what I did 1) q=displayName:rvn* brings me two records (a) RVN Viewpoint Users and (b) RVN Project Admins 2) {!complexphrase}RVN* -- Unknown query type

Re: How to stop Solr tokenising search terms with spaces

2014-12-06 Thread Yonik Seeley
On Sat, Dec 6, 2014 at 7:17 PM, Dinesh Babu dinesh.b...@pb.com wrote: Just curious, why solr does not provide a simple mechanism to do a phrase search ? Simple phrase queries: q= field1:"Hanks Major" Phrase queries with wildcards / partial matches are a different story... they are complex:

[ANN] Heliosearch 0.09 (JSON Request API + Distrib for Facet API)

2014-12-05 Thread Yonik Seeley
http://heliosearch.org/download Heliosearch v0.09 Features: o Heliosearch v0.09 is based on (and contains all features of) Lucene/Solr 4.10.2 + most of 4.10.3 o Distributed search support for the new faceted search module / JSON Facet API: http://heliosearch.org/json-facet-api/ o Automatic

Re: TrieLongField not store large longs correctly

2014-11-27 Thread Yonik Seeley
On Wed, Nov 26, 2014 at 10:38 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Looks like one of these: http://stackoverflow.com/questions/1379934/large-numbers-erroneously-rounded-in-javascript Yeah, that's what Brendan pointed to earlier in this thread. In the UI code, we just seem to be

Re: cross site scripting

2014-11-26 Thread Yonik Seeley
It would have been helpful if you would have pointed out exactly what you think the problem is. I still don't see an issue, since it doesn't look like any encapsulation has been broken. -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Wed, Nov

Re: cross site scripting

2014-11-26 Thread Yonik Seeley
On Wed, Nov 26, 2014 at 10:47 AM, Lee Carroll lee.a.carr...@googlemail.com wrote: The applications using the data may write solr data to the dom. (I doubt they do but they could now or in the future. They have an expectation of trusting the data back from solr). As a straight forward attack

Re: cross site scripting

2014-11-26 Thread Yonik Seeley
On Wed, Nov 26, 2014 at 11:41 AM, Lee Carroll lee.a.carr...@googlemail.com wrote: Just out of interest, what is the use-case for a pseudo-field whose value is a repeat of the field name? Not having to specify a field name for the function query: fl=add(x,y) comes back as (for example)

Re: TrieLongField not store large longs correctly

2014-11-26 Thread Yonik Seeley
On Wed, Nov 26, 2014 at 7:10 PM, Erick Erickson erickerick...@gmail.com wrote: This is very weird, someone want to check this out to ensure that I'm not hallucinating? I just tried the following in Heliosearch, since I had it open (based on 4.10.x): @Test public void testWeird() throws

Re: TrieLongField not store large longs correctly

2014-11-26 Thread Yonik Seeley
On Wed, Nov 26, 2014 at 7:30 PM, Erick Erickson erickerick...@gmail.com wrote: Hmmm, this seems to be browser related because if I use curl or Safari, the return and display are fine. i.e. curl http://localhost:8983/solr/collection1/query?q=*:* displays: eoe_tl:20140716126615472,

Re: TrieLongField not store large longs correctly

2014-11-26 Thread Yonik Seeley
On Wed, Nov 26, 2014 at 7:57 PM, Brendan Humphreys bren...@canva.com wrote: I'd wager this is a loss of precision caused by Javascript rounding in the admin client. More details here: http://stackoverflow.com/questions/1379934/large-numbers-erroneously-rounded-in-javascript Ah, indeed - I was
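The rounding described in this thread can be reproduced outside a browser, since JavaScript numbers are IEEE-754 doubles and Python's float uses the same format. The following is only an illustrative sketch, not code from the thread; the helper name is made up for the example, and the sample values are chosen around 2**53 rather than taken from the poster's data.

```python
# JavaScript numbers are IEEE-754 doubles; Python's float is the same format,
# so a round trip through float() mimics what a JavaScript client does to a long.

def survives_double_round_trip(n: int) -> bool:
    """Hypothetical helper: True if the integer n is exactly representable as a double."""
    return int(float(n)) == n

# Integers up to 2**53 are always exact; beyond that only some are.
print(survives_double_round_trip(2**53))      # exactly representable
print(survives_double_round_trip(2**53 + 1))  # rounded to a neighbouring double
```

A client that needs exact 64-bit values would have to fetch them with something that does not parse numbers as doubles (curl, as noted in the thread, shows the stored value unchanged).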

Re: TrieLongField not store large longs correctly

2014-11-26 Thread Yonik Seeley
:02, Yonik Seeley yo...@heliosearch.com wrote: On Wed, Nov 26, 2014 at 7:57 PM, Brendan Humphreys bren...@canva.com wrote: I'd wager this is a loss of precision caused by Javascript rounding in the admin client. More details here: http://stackoverflow.com/questions/1379934/large-numbers

Re: IndexSearcher not being closed

2014-11-20 Thread Yonik Seeley
On Wed, Nov 19, 2014 at 8:37 AM, Priya Rodrigues roddied...@gmail.com wrote: public void setContext( TransformContext context ) { try { IndexReader reader = qparser.getReq().getSearcher().getIndexReader(); -Refcount incremented You can get a searcher from the request as many times

Re: Solr JOIN: keeping permission data out of primary documents

2014-11-19 Thread Yonik Seeley
On Tue, Nov 18, 2014 at 3:47 PM, Philip Durbin philip_dur...@harvard.edu wrote: Solr JOINs are a way to enforce simple document security, as explained by Yonik Seeley at http://lucene.472066.n3.nabble.com/document-level-security-filter-solution-for-Solr-tp4126992p4126994.html I'm trying

Re: Solr JOIN: keeping permission data out of primary documents

2014-11-19 Thread Yonik Seeley
On Wed, Nov 19, 2014 at 9:22 AM, Philip Durbin philip_dur...@harvard.edu wrote: On Wed, Nov 19, 2014 at 5:45 AM, Yonik Seeley yo...@heliosearch.com wrote: On Tue, Nov 18, 2014 at 3:47 PM, Philip Durbin philip_dur...@harvard.edu wrote: Solr JOINs are a way to enforce simple document security

Re: DocSet getting cached in filterCache for facet request with {!cache=false}

2014-11-11 Thread Yonik Seeley
On Tue, Nov 11, 2014 at 1:25 PM, Mohsin Beg Beg mohsin@oracle.com wrote: Wiki says fq={!cache=false}*:* is ok, no? That's for the filtering... not for the faceting. then how to skip filterCache for facet.method=enum ? Specify a high minDF (the min docfreq or number of documents that need

Re: Solr 4.10 very slow on build()

2014-11-08 Thread Yonik Seeley
Try commenting out the suggester component handler in solrconfig.xml: https://issues.apache.org/jira/browse/SOLR-6679 -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Sat, Nov 8, 2014 at 2:03 PM, Mohsen Saboorian mohs...@gmail.com wrote: I

Re: order of updates

2014-11-03 Thread Yonik Seeley
On Mon, Nov 3, 2014 at 8:53 AM, Matteo Grolla matteo.gro...@gmail.com wrote: HI, can anybody give me a confirm? If I add multiple document with the same id but differing on other fields and then issue a commit (no commits before this) the last added document gets indexed, right?

Re: Solr slow start up (tlog is small)

2014-11-03 Thread Yonik Seeley
Can you tell from the logs what Solr is doing during that time? Do you have any warming queries configured? Also see this: https://issues.apache.org/jira/browse/SOLR-6679 (comment out suggester related stuff if you aren't using it) -Yonik http://heliosearch.org - native code faceting, facet

Re: Solr slow startup

2014-11-03 Thread Yonik Seeley
One possible cause of a slow startup with the default configs: https://issues.apache.org/jira/browse/SOLR-6679 -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Mon, Nov 3, 2014 at 11:05 AM, Michal Krajňanský michal.krajnan...@gmail.com wrote:

Re: [ANN] Heliosearch 0.08 released

2014-10-28 Thread Yonik Seeley
On 27.10.2014 at 17:25, Yonik Seeley wrote: http://heliosearch.org/download Heliosearch v0.08 Features: o Heliosearch v0.08 is based on (and contains all features of) Lucene/Solr 4.10.2 o Streaming Aggregations over search results API: http://heliosearch.org/streaming-aggregation

[ANN] Heliosearch 0.08 released

2014-10-27 Thread Yonik Seeley
http://heliosearch.org/download Heliosearch v0.08 Features: o Heliosearch v0.08 is based on (and contains all features of) Lucene/Solr 4.10.2 o Streaming Aggregations over search results API: http://heliosearch.org/streaming-aggregation-for-solrcloud/ o Optimized request logging, and

Re: recip function error

2014-10-23 Thread Yonik Seeley
On Thu, Oct 23, 2014 at 7:47 PM, Michael Sokolov msoko...@safaribooksonline.com wrote: 3.16e-11.0 looks fishy to me Indeed... looks like it should be 3.16e-11 Standard scientific notation shouldn't have decimal points in the exponent. Not sure if that causes Java problems or not though...
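As a quick illustration of why a decimal point in the exponent looks fishy, here is a small sketch using Python's float parser. It says nothing about how Java or Solr's function parser treats the same string (the thread leaves that open); the helper name is hypothetical.

```python
def parse_float_or_none(text):
    """Hypothetical helper: parse text as a float, or return None if it is malformed."""
    try:
        return float(text)
    except ValueError:
        return None

print(parse_float_or_none("3.16e-11"))    # well-formed scientific notation
print(parse_float_or_none("3.16e-11.0"))  # exponent with a decimal point is rejected
```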

Re: SOLR Boolean clause impact on memory/Performance

2014-10-14 Thread Yonik Seeley
A terms query will be better than a boolean query here (assuming you don't care about scoring those terms): http://heliosearch.org/solr-terms-query/ But you need a recent version of Solr or Heliosearch. -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap

Re: Payload with Local Params?

2014-10-11 Thread Yonik Seeley
On Sat, Oct 11, 2014 at 12:22 AM, William Bell billnb...@gmail.com wrote: I want to call: http://localhost:8983/solr/collection1/query?defType=myqp&yy=electronics&q=payloads:$yy How do I pass $yy to the parser and have it use electronics instead of the literal $yy? Solr only does parameter

Re: Solr Index to Helio Search

2014-10-09 Thread Yonik Seeley
Hmmm, I imagine this is due to the lucene back compat bugs that were in 4.10, and the fact that the last release of heliosearch was branched off of the 4x branch. I just tried moving an index back and forth between my local heliosearch copy and solr 4.10.1 and things worked fine. Here's the

Re: queryResultCache's size is not increasing

2014-10-07 Thread Yonik Seeley
It's your full-import every 5 minutes. A queryResultCache will be invalidated by changes to the index (i.e. a commit) and the size will drop back to 0. -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Tue, Oct 7, 2014 at 4:53 AM, Lee Chunki

Re: Question about filter cache size

2014-10-03 Thread Yonik Seeley
On Fri, Oct 3, 2014 at 3:42 PM, Peter Keegan peterlkee...@gmail.com wrote: Say I have a boolean field named 'hidden', and less than 1% of the documents in the index have hidden=true. Do both these filter queries use the same docset cache size? : fq=hidden:false fq=!hidden:true Nope...

Re: Question about filter cache size

2014-10-03 Thread Yonik Seeley
On Fri, Oct 3, 2014 at 4:35 PM, Shawn Heisey apa...@elyograg.org wrote: On 10/3/2014 1:57 PM, Yonik Seeley wrote: On Fri, Oct 3, 2014 at 3:42 PM, Peter Keegan peterlkee...@gmail.com wrote: Say I have a boolean field named 'hidden', and less than 1% of the documents in the index have hidden

Re: Question about filter cache size

2014-10-03 Thread Yonik Seeley
On Fri, Oct 3, 2014 at 6:38 PM, Peter Keegan peterlkee...@gmail.com wrote: it will be cached as hidden:true and then inverted Inverted at query time, so for best query performance use fq=hidden:false, right? Yep. -Yonik http://heliosearch.org - native code faceting, facet functions,

Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread Yonik Seeley
On Sat, Sep 27, 2014 at 2:52 PM, White, Bill bwh...@ptfs.com wrote: Hello, I've attempted to figure this out from reading the documentation but without much luck. I looked for a comprehensive query syntax specification (e.g., with BNF and a list of operator semantics) but I'm unable to find

Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread Yonik Seeley
or green, but do NOT match ANY other color. I can probably drop the requirement related to having no color. On Sat, Sep 27, 2014 at 3:28 PM, Yonik Seeley yo...@heliosearch.com wrote: On Sat, Sep 27, 2014 at 2:52 PM, White, Bill bwh...@ptfs.com wrote: Hello, I've attempted to figure this out

Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread Yonik Seeley
On Sat, Sep 27, 2014 at 3:46 PM, White, Bill bwh...@ptfs.com wrote: Hmm, that won't work since color is free-form. Is there a way to invoke (via fq) a user-defined function (hopefully defined as part of the fq syntax, but alternatively, written in Java) and have it applied to the resultset?

Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread Yonik Seeley
Heh... very clever, Mikhail! -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Sat, Sep 27, 2014 at 4:43 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: indeed! the exclusive range {green TO red} matches to the lemon yellow hence, the

Re: Does soft commit block on autowarming?

2014-09-24 Thread Yonik Seeley
On Wed, Sep 24, 2014 at 6:56 PM, Bruce Johnson br...@fullstory.com wrote: Is it reliably true that once a soft commit request returns, any subsequent queries will hit a new (and autowarmed) searcher? Yes. The default for commit and softCommit commands is waitSearcher=true, which will not return

[ANN] Heliosearch 0.07 released

2014-09-07 Thread Yonik Seeley
http://heliosearch.org/download Heliosearch v0.07 Features o Heliosearch v0.07 is based on (and contains all features of) Lucene/Solr 4.10.0 o An optimized Terms Query with native code performance enhancements for efficiently matching multiple terms in a field.

Re: Performance of Boolean query with hundreds of OR clauses.

2014-09-07 Thread Yonik Seeley
Solr 4.10 has added a {!terms} query that should speed up these cases. Benchmarks here: http://heliosearch.org/solr-terms-query/ -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Tue, Aug 19, 2014 at 2:57 PM, SolrUser1543 osta...@gmail.com

Re: Custom stat aggregation and sorting function in solr

2014-08-17 Thread Yonik Seeley
On Sun, Aug 17, 2014 at 2:35 AM, dhimant dhimant84.jays...@gmail.com wrote: I want to add a new stat function (UniqueUsers(fieldName) like add/avg function already available in Solr) to find the unique across searched Solr records. Heliosearch (a solr fork) has this:

Re: Custom stat aggregation and sorting function in solr

2014-08-17 Thread Yonik Seeley
On Sun, Aug 17, 2014 at 7:27 AM, dhimant dhimant84.jays...@gmail.com wrote: Hi Yonik, Thanks for the reply. But i want a unique function on my binary column. This column contains binary representation of java hashset. Ah, got it... hopefully it's your own binary format and not Java

Re: Syntax unavailable for parameter substitution Solr 3.5

2014-08-16 Thread Yonik Seeley
You can't do this with stock solr, but a generic templating ability is now in heliosearch (a fork of solr): http://heliosearch.org/solr-query-parameter-substitution/ -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Fri, Aug 15, 2014 at 5:46 AM,

Re: content-type for json is not right

2014-08-09 Thread Yonik Seeley
It's configurable: https://issues.apache.org/jira/browse/SOLR-1123 It has been text/plain since v1.0 by default (so it will render in browsers) - perhaps you just never noticed? -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Sat, Aug 9, 2014

Re: Solr vs ElasticSearch

2014-08-04 Thread Yonik Seeley
On Mon, Aug 4, 2014 at 2:43 AM, Alexandre Rafalovitch arafa...@gmail.com wrote: That resource is rather superficial. I wouldn't make big decision based on it. Agree. It's also somewhat biased given the environment in which it grew. ES advocates were all over stuff like that, but Solr advocates

Re: Stand alone Solr - no zookeeper?

2014-08-04 Thread Yonik Seeley
On Fri, Aug 1, 2014 at 10:48 AM, Joel Cohen jco...@grubhub.com wrote: The only thing so far that I see as a hurdle here is the data set size vs. heap size. If the index grows too large, then we have to increase the heap size, which could lead to longer GC times. Servers could pop in and out of

Re: Search results inconsistency when using joins

2014-07-29 Thread Yonik Seeley
The join qparser has no fq parameter, so that is ignored. -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Tue, Jul 29, 2014 at 12:12 PM, heaven aheave...@gmail.com wrote: _query_:{!join from=profile_ids_im to=id_i v=$qTweet107001860

Re: stats.facet with multi-valued field in Solr 4.9

2014-07-21 Thread Yonik Seeley
On Mon, Jul 21, 2014 at 7:09 AM, Nico Kaiser n...@kaiser.me wrote: After the upgrade to Solr 4.9 (from 3.6) this seems not to be possible anymore: Stats can only facet on single-valued fields, not: instrumentIds https://issues.apache.org/jira/browse/SOLR-3642 It looks like perhaps it never

Re: stats.facet with multi-valued field in Solr 4.9

2014-07-21 Thread Yonik Seeley
On Mon, Jul 21, 2014 at 7:32 AM, Nico Kaiser n...@kaiser.me wrote: Yonik, thanks for your reply! I also found https://issues.apache.org/jira/browse/SOLR-1782 which also seems to deal with this, but I did not find out whether there is a workaround. For our use case the previous behaviour was

Re: faceting within facets

2014-07-21 Thread Yonik Seeley
On Mon, Jul 21, 2014 at 8:08 AM, David Flower dflo...@amplience.com wrote: Is it possible to create a facet within another facet in a single query For simple field facets, there's pivot faceting. For more complex nested facets, there are sub-facets in heliosearch (a solr fork):

Re: Performance issues with facets and filter query exclusions

2014-07-18 Thread Yonik Seeley
On Fri, Jul 18, 2014 at 2:10 PM, Hayden Muhl haydenm...@gmail.com wrote: I was doing some performance testing on facet queries and I noticed something odd. Most queries tended to be under 500 ms, but every so often the query time jumped to something like 5000 ms.

Re: problem with replication/solrcloud - getting 'missing required field' during update intermittently (SOLR-6251)

2014-07-17 Thread Yonik Seeley
On Wed, Jul 16, 2014 at 10:20 PM, Nathan Neulinger nn...@neulinger.org wrote: [{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "preet"},"channel": {"add": "adam"}}] Look at the JSON... It's trying to add two channel array elements... Should have been: [...] From what I'm reading on JSON -

Re: TrieDateField, precisionStep impact on sorting performance

2014-07-16 Thread Yonik Seeley
On Wed, Jul 16, 2014 at 5:51 AM, Kuehn, Dennis dennis.ku...@brands4friends.de wrote: I'd like to sort on a TrieDateField which currently has a precisionStep value of 6. From what I got so far, the precisionStep value only affects range query performance and index size. However, the

Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-20 Thread Yonik Seeley
wants to help out, the github issue is here: https://github.com/Heliosearch/heliosearch/issues/13 -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Thursday, June 19, 2014 3:46 PM, Yonik Seeley yo...@heliosearch.com wrote: FYI, for those

Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-20 Thread Yonik Seeley
On Fri, Jun 20, 2014 at 10:15 AM, Yago Riveiro yago.rive...@gmail.com wrote: Yonik, This native code uses in any way the docValues? Nope... not yet. It is something I think we should look into in the future though. In the past I was forced to indexed a big portion of my data with docValues

Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-20 Thread Yonik Seeley
On Fri, Jun 20, 2014 at 11:16 AM, Floyd Wu floyd...@gmail.com wrote: Will these awesome features be implemented in Solr soon? On 2014/6/20 at 10:43 PM, Yonik Seeley yo...@heliosearch.com wrote: Given the current makeup of the joint Lucene/Solr PMC, it's unclear. I'm not worrying about that for now

Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-20 Thread Yonik Seeley
On Fri, Jun 20, 2014 at 12:36 PM, Floyd Wu floyd...@gmail.com wrote: Hi Yonik, I don't understand the relationship between solr and heliosearch since you were committer of solr? Heliosearch is a Solr fork that will hopefully find its way back to the ASF in the future. Here's the original

[ANN] Heliosearch 0.06 released, native code faceting

2014-06-19 Thread Yonik Seeley
FYI, for those who want to try out the new native code faceting, this is the first release containing it (for single valued string fields only as of yet). http://heliosearch.org/download/ Heliosearch v0.06 Features: o Heliosearch v0.06 is based on (and contains all features of) Lucene/Solr

Re: ANN: Solr Next

2014-06-09 Thread Yonik Seeley
On Tue, Jan 7, 2014 at 1:53 PM, Yonik Seeley ysee...@gmail.com wrote: [...] Next major feature: Native Code Optimizations. In addition to moving more large data structures off-heap(like UnInvertedField?), I am planning to implement native code optimizations for certain hotspots. Native code

Re: Is the act of *caching* an fq very expensive? (seems to cost 4 seconds in my example)

2014-06-03 Thread Yonik Seeley
On Tue, Jun 3, 2014 at 4:44 PM, Brett Hoerner br...@bretthoerner.com wrote: If I run a query like this, fq=text:lol fq=created_at_tdid:[1400544000 TO 1400630400] It takes about 6 seconds. Following queries take only 50ms or less, as expected because my fqs are cached. However, if I change

Re: Is the act of *caching* an fq very expensive? (seems to cost 4 seconds in my example)

2014-06-03 Thread Yonik Seeley
On Tue, Jun 3, 2014 at 5:19 PM, Yonik Seeley yo...@heliosearch.com wrote: So try: q=*:* fq=created_at_tdid:[1400544000 TO 1400630400] vs So try: q=*:* fq={!cache=false}created_at_tdid:[1400544000 TO 1400630400] -Yonik http://heliosearch.org - facet functions, subfacets, off-heap
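For anyone wanting to script the comparison suggested here, the two requests differ only in the {!cache=false} local param on the fq. The sketch below just builds the two query strings with Python's standard library; the function name is made up for the example, and the host, core, and handler path are left out because the thread does not give them.

```python
from urllib.parse import urlencode

# The range filter from the thread, unchanged.
RANGE_FQ = "created_at_tdid:[1400544000 TO 1400630400]"

def build_query_string(cache: bool) -> str:
    """Hypothetical helper: URL-encode q=*:* plus the range fq, cached or uncached."""
    fq = RANGE_FQ if cache else "{!cache=false}" + RANGE_FQ
    return urlencode([("q", "*:*"), ("fq", fq)])

print(build_query_string(cache=True))
print(build_query_string(cache=False))
```

Timing the two variants against the same index is then a matter of appending each string to the select URL of the collection under test.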

Re: Is the act of *caching* an fq very expensive? (seems to cost 4 seconds in my example)

2014-06-03 Thread Yonik Seeley
On Tue, Jun 3, 2014 at 9:48 PM, Brett Hoerner br...@bretthoerner.com wrote: Yonik, I'm familiar with your blog posts -- and thanks very much for them. :) Though I'm not sure what you're trying to show me with the q=*:* part? I was of course using q=*:* in my queries, but I assume you mean to

Re: slow performance on simple filter

2014-05-31 Thread Yonik Seeley
On Sat, May 31, 2014 at 8:47 AM, mizayah miza...@gmail.com wrote: i show you my full query it's rly simple one q=*:* and fq=class_name:CdnFile debug q shows that process of q takes so long. single filter is critical here. 400ms is too long... something is strange. One possibility is that

Re: Regex with local params is not working

2014-05-28 Thread Yonik Seeley
On Wed, May 28, 2014 at 1:41 AM, Lokn nlokesh...@gmail.com wrote: Thanks for the reply. I am using edismax for the query parsing. Still it's not working. Instead of using local params, if I use the field directly then regex is working fine. It's not for me... This does not work:

Re: Regex with local params is not working

2014-05-27 Thread Yonik Seeley
On Tue, May 27, 2014 at 4:38 AM, Lokn nlokesh...@gmail.com wrote: With solr local params, the regex is not working. My sample query: q ={!qf=$myfield_qf}/[a-d]ad/, where I have myfield_qf defined in the solrconfig.xml. add debugQuery=true to the request to see what query is actually produced.

Re: 答复: Internals about Too many values for UnInvertedField faceting on field xxx

2014-05-27 Thread Yonik Seeley
On Mon, May 26, 2014 at 9:21 PM, 张月祥 zhan...@calis.edu.cn wrote: Thanks a lot. There are only 256 byte arrays to hold all of the ord data, and the pointers into those arrays are only 24 bits long. That gets you back to 32 bits, or 4GB of ord data max. It's practically less since you only
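The arithmetic behind the 4GB figure quoted above can be checked in a few lines. This is only a worked calculation from the numbers in the message (256 arrays, 24-bit pointers); the constant names are chosen for the example and are not Solr identifiers.

```python
NUM_BYTE_ARRAYS = 256  # number of byte arrays holding ord data (from the message)
POINTER_BITS = 24      # width of a pointer into one array (from the message)

# 256 = 2**8 arrays, each addressable up to 2**24 bytes, gives 2**32 bytes in total.
max_ord_bytes = NUM_BYTE_ARRAYS * 2**POINTER_BITS

print(max_ord_bytes)           # total addressable bytes
print(max_ord_bytes // 2**30)  # the same figure in GiB
```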

Re: Query translation of User Fields

2014-05-25 Thread Yonik Seeley
On Thu, May 22, 2014 at 10:56 AM, Jack Krupansky j...@basetechnology.com wrote: Hmmm... that doesn't sound like what I would have expected - I would have thought that Solr would throw an exception on the user field, rather than simply treat it as a text keyword. No, I believe that's working as

Re: Reply: Internals about Too many values for UnInvertedField faceting on field xxx

2014-05-25 Thread Yonik Seeley
On Sat, May 24, 2014 at 9:50 PM, 张月祥 zhan...@calis.edu.cn wrote: Thanks for your reply. I'll try it. We're still interested in the real limitation about Too many values for UnInvertedField faceting on field xxx . Could anybody tell us some internals about Too many values for

Re: How does query on few-hits AND many-hits work

2014-05-23 Thread Yonik Seeley
On Fri, May 23, 2014 at 11:37 AM, Toke Eskildsen t...@statsbiblioteket.dk wrote: Per Steffensen [st...@designware.dk] wrote: * It IS more efficient to just use the index for the no_dlng_doc_ind_sto-part of the request to get doc-ids that match that part and then fetch timestamp-doc-values for

Re: Extensibility and code reuse: SOLR vs Lucene

2014-05-20 Thread Yonik Seeley
On Tue, May 20, 2014 at 6:01 PM, Achim Domma do...@procoders.net wrote: - I found several times code snippets like if (collector instanceof DelegatingCollector) { ((DelegatingCollector)collector).finish() } . Such code is considered bad practice in every OO language I know. Do I miss

Re: Solr performance: multiValued filed vs separate fields

2014-05-17 Thread Yonik Seeley
On Thu, May 15, 2014 at 10:29 AM, danny teichthal dannyt...@gmail.com wrote: I wonder about performance difference of 2 indexing options: 1- multivalued field 2- separate fields The case is as follows: Each document has 100 “properties”: prop1..prop100. The values are strings and there is no

Re: deep paging without sorting / keep IRs open

2014-05-17 Thread Yonik Seeley
On Wed, May 14, 2014 at 8:34 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Basically I need the ability to keep running searches against a specified commit point / index reader / state of the Lucene / Solr index. I think searcher leases would fit the bill here?

Re: deep paging without sorting / keep IRs open

2014-05-17 Thread Yonik Seeley
On Sat, May 17, 2014 at 10:30 AM, Yonik Seeley yo...@heliosearch.com wrote: I think searcher leases would fit the bill here? https://issues.apache.org/jira/browse/SOLR-2809 Not yet implemented though... FYI, I just put up a simple LeaseManager implementation on that issue. -Yonik http

Re: Question regarding the lastest version of HeliosSearch

2014-05-16 Thread Yonik Seeley
On Thu, May 15, 2014 at 3:44 PM, Jean-Sebastien Vachon jean-sebastien.vac...@wantedanalytics.com wrote: I spent some time today playing around with subfacets and facets functions now available in helios search 0.05 and I have some concerns... They look very promising . Thanks, glad for the

Re: Histogram facet?

2014-05-06 Thread Yonik Seeley
On Mon, May 5, 2014 at 6:18 PM, Romain romain@gmail.com wrote: Hi, I am trying to plot a non date field by time in order to draw an histogram showing its evolution during the week. For example, if I have a tweet index: Tweet: date retweetCount 3 tweets indexed: Tweet | Date |

Re: query(subquery, default) filters results

2014-05-06 Thread Yonik Seeley
On Tue, May 6, 2014 at 5:08 AM, Matteo Grolla matteo.gro...@gmail.com wrote: Hi everybody, I'm having troubles with the function query query(subquery, default) http://wiki.apache.org/solr/FunctionQuery#query running this

Re: Histogram facet?

2014-05-06 Thread Yonik Seeley
On Tue, May 6, 2014 at 5:30 PM, Romain Rigaux rom...@cloudera.com wrote: This looks nice! The only missing piece for more interactivity would be to be able to map multiple field values into the same bucket. e.g. http://localhost:8983/solr/query? q=*:* facet=true

HDS 4.8.0_01 released - solr tomcat distro

2014-05-01 Thread Yonik Seeley
For those Tomcat fans out there, we've released HDS 4.8.0_01, based on Solr 4.8.0 of course. HDS is pretty much just Apache Solr, with the addition of a Tomcat based server. Download: http://heliosearch.com/heliosearch-distribution-for-solr/ HDS details: - includes a pre-configured (threads,

Re: TB scale

2014-04-25 Thread Yonik Seeley
How many documents? That can be just as important (often more important) than total index size. Some other details, like the types of requests, would be helpful (i.e. what the index will be used for... the latency requirements of requests, if you will be faceting, etc). -Yonik

Re: Filter query with multiple raw/literal ORs

2014-04-04 Thread Yonik Seeley
On Fri, Apr 4, 2014 at 5:28 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: On Fri, Apr 4, 2014 at 4:08 AM, Yonik Seeley yo...@heliosearch.com wrote: Try adding a space before the first term, so the default lucene query parser will be used: Yonik, I'm curious, whether it a feature
