Re: Adding replica to a shard with only down replicas

2020-02-14 Thread tedsolr
Overnight the replicas with a state of "down" changed to "recovery_failed". Nothing I did. So I brought down both nodes, then started one and waited 5 min. A leader was born, then I started the other node. So luckily no heroics were needed. I'll remember your advice about creating a parallel

Re: Adding replica to a shard with only down replicas

2020-02-14 Thread tedsolr
Yes I did Erick, and that didn't do it. What about manual manipulation of the zookeeper data? Rather than telling the customer they need to rebuild from scratch, I'd prefer to attempt some last minute heroics. -- Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Adding replica to a shard with only down replicas

2020-02-13 Thread tedsolr
Solr 5.5.4. I have a collection with a single shard and two replicas. Both are reporting down. No shard leader exists. Each replica is on a different node. Should it be safe to attempt an ADDREPLICA command? Since there's no leader I don't know if that will work. This is the cluster state for the
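
A minimal SolrJ sketch of the ADDREPLICA call under discussion. The factory method used below comes from newer SolrJ releases (on 5.5.4 the setter-style CollectionAdminRequest.AddReplica builder, or the plain Collections API URL in the comment, would be used instead); the collection name and ZooKeeper address are placeholders.

    import java.util.Collections;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;
    import org.apache.solr.client.solrj.response.CollectionAdminResponse;

    public class AddReplicaSketch {
        public static void main(String[] args) throws Exception {
            // Same operation over HTTP:
            // /admin/collections?action=ADDREPLICA&collection=mycoll&shard=shard1
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    Collections.singletonList("zkhost:2181"), Optional.empty()).build()) {
                CollectionAdminResponse rsp = CollectionAdminRequest
                        .addReplicaToShard("mycoll", "shard1")
                        .process(client);
                System.out.println("ADDREPLICA status: " + rsp.getStatus());
            }
        }
    }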

Re: Solr 6.x and java 8

2018-09-21 Thread tedsolr
Shawn, My application environment runs java 1.8. However I'm stuck building to 1.7 for now. I can still use SolrJ 6.1 in my app as long as I only deploy the SolrJ JAR and not build it from source. Right? -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Solr 6.x and java 8

2018-09-20 Thread tedsolr
I realize that a java 8 runtime environment is required for Solr 6.x. Is it also necessary to compile to java 8 for any custom plugins running on the solr server? What about including SolrJ libraries in client code that is still compiling to 1.7? thanks, Ted -- Sent from:

Re: Search for a specific unicode char

2018-07-31 Thread tedsolr
This is an example of what the data looks like: "SOURCEFILEID":"77907", "APPROP_GROUP_CODE_T":"F\uG\uR", "APPROP_GROUP_CODE_T_aggr":"F\uG\uR", "APPROP_GROUP_CODE_T_search":"F\uG\uR", "OBJECT_DESC_T":"OTHER PROFESSIONAL/TECHNICAL SERVICES",

Search for a specific unicode char

2018-07-31 Thread tedsolr
I'm having some trouble with non printable, but valid, UTF8 chars when exporting to Amazon Redshift. The export fails but I can't yet find this data in my Solr collection. How can I search, say from the admin console, for a particular character? I'm looking for U+001E and U+001F thanks! Solr
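
One way to run that search from SolrJ rather than the admin console is to embed the control character directly in the query string (Java resolves the \u001E escape at compile time). This is only a sketch: the _search field name comes from the example elsewhere in this thread, the base URL and collection are placeholders (HttpSolrClient.Builder as in SolrJ 6+), and it assumes the field preserves the raw character (e.g. a string field) so a wildcard term can match it.

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class FindControlChars {
        public static void main(String[] args) throws Exception {
            try (HttpSolrClient client =
                     new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build()) {
                // U+001E (record separator) embedded as a literal character in a wildcard query
                SolrQuery q = new SolrQuery("APPROP_GROUP_CODE_T_search:*" + '\u001E' + "*");
                q.setRows(10);
                QueryResponse rsp = client.query(q);
                System.out.println("docs containing U+001E: " + rsp.getResults().getNumFound());
            }
        }
    }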

Search support for regex style spaces

2018-04-24 Thread tedsolr
Does Solr have regex search support for "\s"? As in: q=FIELD:/starts with[\s0-9]*/ Neither \s nor \\s seems to have any effect. thanks, using solr 5.5.4 -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
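
For what it's worth, Lucene's regexp syntax (which backs the /.../ queries) does not recognize Perl-style shorthand classes such as \s, so the space has to be written literally inside the character class. A sketch of that workaround, assuming FIELD is an untokenized (string) field so a term can actually contain a space; the URL and collection name are placeholders.

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;

    public class RegexSpaceQuery {
        public static void main(String[] args) throws Exception {
            try (HttpSolrClient client =
                     new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build()) {
                // character class spelled out with a literal space instead of the unsupported \s
                SolrQuery q = new SolrQuery("FIELD:/starts with[ 0-9]*/");
                System.out.println(client.query(q).getResults().getNumFound());
            }
        }
    }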

Create collection error in 5.5.4

2017-12-21 Thread tedsolr
I upgraded Solr from 5.2.1 to 5.5.4 recently. Occasionally when creating a new collection via the Collections API I get an error: Could not fully create collection: . This has never happened previous to this upgrade. It's happened twice in my development environment and once in my user test

Re: Moving solr home

2017-04-18 Thread tedsolr
tallation scripts to deploy. This > makes it very easy for Prod deployments and allows to decouple data/index > directory with Solr binaries. See below link > > https://cwiki.apache.org/confluence/display/solr/Taking+Solr+to+Production > > Thanks, > Susheel > > On M

Moving solr home

2017-04-17 Thread tedsolr
I have a solr cloud cluster (v5.2.1 on redhat linux) that uses the default location for solr home: (install dir)/server/solr. I would like to move the index data somewhere else to make upgrades easier. When I set a SOLR_HOME variable solr appears to be ignoring it - and even creating a solr.xml

Re: Collection will not replicate

2017-02-03 Thread tedsolr
il too unless you set shards.tolerant. > > You really wouldn't want your docs lost is the reasoning. > > On Feb 2, 2017 6:56 AM, "tedsolr" > tsmith@ > wrote: > >> Can I assume that without a leader the shard will not respond to write >> requests? I can

Re: Collection will not replicate

2017-02-02 Thread tedsolr
17 at 1:57 PM, Jeff Wartes > jwartes@ > wrote: >> Sounds similar to a thread last year: >> http://lucene.472066.n3.nabble.com/Node-not-recovering-leader-elections-not-occuring-tp4287819p4287866.html >> >> >> >> On 2/1/17, 7:49 AM, "tedsolr" &g

Re: Collection will not replicate

2017-02-01 Thread tedsolr
Update! I did find an error: 2017-02-01 09:23:22.673 ERROR org.apache.solr.common.SolrException :org.apache.solr.common.SolrException: Error getting leader from zk for shard shard1 Caused by: org.apache.solr.common.SolrException: Could not get leader props at

Collection will not replicate

2017-02-01 Thread tedsolr
I have a collection (1 shard, 2 replicas) that was doing a batch update when one solr host ran out of disk space. The batch job failed at that point, and one replica got corrupted. I deleted the bad replica. I've tried several times since then to add a new replica. The status of the request is

Re: Reindex after schema change options

2016-10-28 Thread tedsolr
on it to using the /export request handler and that was good - no errors. Still would love to hear from anyone who has done this differently. tedsolr wrote > Not all my fields use docValues. This is going to be a problem in the > future. Once I change the schema.xml to use docValues for these certain &

Reindex after schema change options

2016-10-27 Thread tedsolr
Not all my fields use docValues. This is going to be a problem in the future. Once I change the schema.xml to use docValues for these certain field types, how do I reindex the data in place - without starting from the source? I'm aware of lucene's IndexUpgrader but that will only ensure a correct
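
In case it helps anyone weighing the options, a rough SolrJ sketch of one way to reindex in place from stored fields: cursor through every document and re-add it so the changed schema (docValues) takes effect. It assumes all fields are stored, uses the 5.x-style CloudSolrClient constructor, and the collection and unique-key names are placeholders; stored copyField targets would also need to be skipped.

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.common.params.CursorMarkParams;

    public class ReindexInPlace {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient("zkhost:2181")) {
                client.setDefaultCollection("mycollection");
                SolrQuery q = new SolrQuery("*:*");
                q.setRows(1000);
                q.setSort(SolrQuery.SortClause.asc("id")); // cursor paging requires a sort on the unique key
                String cursor = CursorMarkParams.CURSOR_MARK_START;
                while (true) {
                    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
                    QueryResponse rsp = client.query(q);
                    for (SolrDocument doc : rsp.getResults()) {
                        SolrInputDocument in = new SolrInputDocument();
                        for (String f : doc.getFieldNames()) {
                            if (!"_version_".equals(f)) { // also skip stored copyField targets
                                in.addField(f, doc.getFieldValue(f));
                            }
                        }
                        client.add(in); // re-added doc picks up the new docValues definition
                    }
                    String next = rsp.getNextCursorMark();
                    if (next.equals(cursor)) break; // cursor stops advancing when the index is exhausted
                    cursor = next;
                }
                client.commit();
            }
        }
    }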

ShardDoc.sortFieldValues are not exposed in v5.2.1

2016-09-01 Thread tedsolr
I'm attempting to perform my own merge of IDs with a MergeStrategy in v5.2.1. I'm a bit hamstrung because the ShardFieldSortedHitQueue is not public. When trying to build my own priority queue I found out that the field sortFieldValues in ShardDoc is package restricted. Now, in v6.1 I see that

Re: DocTransformer not executing with AnalyticsQuery

2016-08-15 Thread tedsolr
The formatting in my question was wiped, reposting I have an AnalyticsQuery that takes several params computed at runtime because they are dynamic. There is a MergeStrategy that needs to combine stats data and merge doc Ids. There is a DocTransformer that injects some stats into each returned

DocTransformer not executing with AnalyticsQuery

2016-08-15 Thread tedsolr
I have an AnalyticsQuery that takes several params computed at runtime because they are dynamic. There is a MergeStrategy that needs to combine stats data and merge doc Ids. There is a DocTransformer that injects some stats into each returned doc. I cannot seem to get all the pieces to work
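
For anyone following the thread, the single-shard half of this setup looks roughly like the skeleton below, patterned after the Heliosearch MergeStrategy write-up: the AnalyticsQuery hands back a DelegatingCollector, and finish() is where the computed stats get attached to the response so a DocTransformer (or a MergeStrategy on a sharded collection) can pick them up. Class names, the stat, and the response key are placeholders.

    import java.io.IOException;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.solr.handler.component.ResponseBuilder;
    import org.apache.solr.search.AnalyticsQuery;
    import org.apache.solr.search.DelegatingCollector;

    public class MyStatsQuery extends AnalyticsQuery {

        @Override
        public DelegatingCollector getAnalyticsCollector(ResponseBuilder rb, IndexSearcher searcher) {
            return new MyStatsCollector(rb);
        }

        static class MyStatsCollector extends DelegatingCollector {
            private final ResponseBuilder rb;
            private long count;

            MyStatsCollector(ResponseBuilder rb) {
                this.rb = rb;
            }

            @Override
            public void collect(int doc) throws IOException {
                count++;            // per-document stats are gathered here
                super.collect(doc); // pass the doc down the collector chain
            }

            @Override
            public void finish() throws IOException {
                rb.rsp.add("mystats", count); // stats a DocTransformer/MergeStrategy can read
                if (this.delegate instanceof DelegatingCollector) {
                    ((DelegatingCollector) this.delegate).finish();
                }
            }
        }
    }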

Re: AnalyticsQuery fails on a sharded collection

2016-08-11 Thread tedsolr
OK, some more info ... it's not aggregating because the doc values it's using for grouping are the unique ID field's. There are some big differences in the whole flow between searches against a single shard collection, and searches against a multi-shard collection. In a single shard collection the

Re: AnalyticsQuery fails on a sharded collection

2016-08-10 Thread tedsolr
Quick update: the NPE was related to the way in which I passed params into the Query via solrconfig.xml. It works fine for single sharded, but something about it was masking the unique ID field in a multisharded environment. Anyway, I was able to fix that by cleaning up the request handler config:

Re: AnalyticsQuery fails on a sharded collection

2016-08-10 Thread tedsolr
I still haven't found the reason for the NPE in my post filter when it runs against a sharded collection, so I'm posting my code in the hopes that a seasoned Solr pro might notice something. I thought perhaps not treating the doc values as multi doc values when indexes are segmented might have

Re: Can a MergeStrategy filter returned docs?

2016-08-09 Thread tedsolr
After some more digging I've learned that the Query gets called more than once on the same shard, but with a different shard purpose. I don't understand the flow, but I assume that one call is triggering the transform() via a path that does not pass through the document collector. I also don't

Re: Can a MergeStrategy filter returned docs?

2016-08-08 Thread tedsolr
Some more info that might be helpful. If I can trust my logging this is what's happening (search with rows=3 on collection with 2 shards): 1) delegating collector finish() method places custom data on request object for _shard 1_ 2) doc transformer transform() method is called for 3 requested

Re: Can a MergeStrategy filter returned docs?

2016-08-08 Thread tedsolr
That makes sense. I would prefer to just merge the custom analytics, but sending that much info via the solr response seems very slow. However I still can't figure out how to access the custom analytics in a doc transformer. That would provide the fastest response but I would have to merge the Ids

Re: Can a MergeStrategy filter returned docs?

2016-08-05 Thread tedsolr
I don't see any field level data exposed in the SolrDocumentList I get from shardResponse.getSolrResponse().getResponse().get("response"). I see the unique ID field and value. Is that by design or am I being stupid? Separate but related question: the mergIds() method in the merge strategy class -

Re: Can a MergeStrategy filter returned docs?

2016-08-04 Thread tedsolr
too late to drop documents in the merge? > > If you can provide a very simple example with some sample records and a > sample output, that would be helpful. > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Thu, Aug 4, 2016 at 4:25 PM, tedsolr > tsmith@ > wrote: &g

Can a MergeStrategy filter returned docs?

2016-08-04 Thread tedsolr
I've been struggling just to get my search plugin working for sharded collections, but I haven't ascertained if my end goal is even achievable. I have a plugin that groups documents that are considered duplicates (based on multiple fields - like the CollapsingQParserPlugin). When responses come

Re: QParsePlugin not working on sharded collection

2016-08-04 Thread tedsolr
So my implementation with a DocTransformer is causing an exception (with a sharded collection): ERROR - 2016-08-04 09:41:44.247; [ShardTest1 shard1_0 core_node3 ShardTest1_shard1_0_replica1] org.apache.solr.common.SolrException;

Re: QParsePlugin not working on sharded collection

2016-08-04 Thread tedsolr
Thanks Erick, you answered my question by pointing out the aggregator. I didn't realize a merge strategy was _required_ to return stats info when there are multiple shards. I'm having trouble with my actual plugin so I've scaled back to the simplest possible example. I'm adding to it little by

Re: QParsePlugin not working on sharded collection

2016-08-03 Thread tedsolr
So I notice if I create the simplest MergeStrategy I can get my test values from the shard responses and then if I add info to the SolrQueryResponse it gets back to the caller. I still must be missing something. I wouldn't expect to have different code paths - one for single shard one for multi

QParsePlugin not working on sharded collection

2016-08-03 Thread tedsolr
I'm trying to verify that a very simple custom post filter will work on a sharded collection. So far it doesn't. Here are the search results on my single shard test collection: { "responseHeader": { "status": 0, "QTime": 17 }, "thecountis": "946028", "myvar": "hello",

Re: AnalyticsQuery fails on a sharded collection

2016-07-28 Thread tedsolr
Thanks Joel! However I've come to realize that upgrading to Solr 6 is not a near term reality due to the Java 8 requirement. I don't want anyone to waste their time debugging my code. At least not until I've made time to really work through it myself. I was just looking for a pointer on

AnalyticsQuery fails on a sharded collection

2016-07-27 Thread tedsolr
I'm looking to create a merge strategy for a custom QParserPlugin I have. The plugin works fine on collections with one shard. I was very surprised to see it throw an exception when I ran it against a sharded collection. So my question is a bit of a shot in the dark. I'll first note that the

Re: Search sort depth limited to 4?

2016-07-26 Thread tedsolr
So I found the limit in the Ref Doc p. 394, under the /export request handler: "Up to four sort fields can be specified per request, with the 'asc' or 'desc' properties" Yikes I'm in trouble. Does anyone know if this can be circumvented? Can I write a custom handler that could handle up to 20?

Search sort depth limited to 4?

2016-07-26 Thread tedsolr
Hi, I'm trying to group search results by fields using the streaming API. I don't see a sort limit mentioned in the Solr Ref Doc, but when I use 4 fields I get results and when I use 5 or more I get an exception: java.util.concurrent.ExecutionException: java.io.IOException: JSONTupleStream:

Re: Should streaming place load on the app server?

2016-07-22 Thread tedsolr
gt; On Fri, Jul 22, 2016 at 11:23 AM, tedsolr > tsmith@ > wrote: > >> The streaming API looks like it's meant to be run from the client app >> server >> - very similar to a standard Solr search. When I run a basic streaming >> operation the memory consumption occu

Should streaming place load on the app server?

2016-07-22 Thread tedsolr
The streaming API looks like it's meant to be run from the client app server - very similar to a standard Solr search. When I run a basic streaming operation the memory consumption occurs on the app server jvm, not the solr server jvm. The opposite of what I was expecting. (pseudo code) Stream A

Re: Specify sorting of merged streams

2016-07-21 Thread tedsolr
The primary use case seems to require a SortStream. Ignoring the large join for now... 1. search main collection with stream (a) 2. search other collection with stream (b) 3. hash join a & b (c) 4. full sort on c 5. aggregate c with reducer 6. apply user sort criteria with top It's very likely

Re: Specify sorting of merged streams

2016-07-21 Thread tedsolr
I can see I may need to rethink some things. I have two joins: one is 1 to 1 (very large) and one is 1 to .03. A HashJoin may work on the smaller one. The large join looks like it may not be possible. I could get away with treating it as a filter somehow - I don't need the fields from the

Re: Specify sorting of merged streams

2016-07-20 Thread tedsolr
I'm hoping I'm just not using the streaming API correctly. I have about 30M docs (~ 15 collections) in production right now that work well with just 4GB of heap (no streaming). I can't believe streaming would choke on my test data. I guess there are 2 primary requirements. Reindexing an entire

Re: Specify sorting of merged streams

2016-07-20 Thread tedsolr
I am getting an OOM error trying to combine streaming operations. I think the sort is the issue. This test was done on a single replica cloud setup of v6.1 with 4GB heap. col1 has 1M docs. col2 has 10k docs. The search for each collection was q=*:*. Using SolrJ: CloudSolrStream searchStream = new
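
For comparison, a stripped-down version of that kind of streaming read (the zkHost, collection and field names are placeholders; depending on the SolrJ version the CloudSolrStream constructor takes a Map or a SolrParams, and newer releases also need the StreamContext/SolrClientCache wiring shown here). The /export handler keeps sorting on the server side, but it requires fl and sort, and every exported field must have docValues.

    import org.apache.solr.client.solrj.io.SolrClientCache;
    import org.apache.solr.client.solrj.io.Tuple;
    import org.apache.solr.client.solrj.io.stream.CloudSolrStream;
    import org.apache.solr.client.solrj.io.stream.StreamContext;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class StreamSketch {
        public static void main(String[] args) throws Exception {
            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("q", "*:*");
            params.set("fl", "id,fieldA");
            params.set("sort", "id asc");
            params.set("qt", "/export"); // stream the whole result set instead of paging /select

            CloudSolrStream stream = new CloudSolrStream("zkhost:2181", "col1", params);
            StreamContext context = new StreamContext();
            SolrClientCache cache = new SolrClientCache();
            context.setSolrClientCache(cache);
            stream.setStreamContext(context);
            try {
                stream.open();
                Tuple tuple;
                while (!(tuple = stream.read()).EOF) {
                    // tuples arrive already ordered by the "sort" param
                    System.out.println(tuple.getString("id"));
                }
            } finally {
                stream.close();
                cache.close();
            }
        }
    }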

Re: Specify sorting of merged streams

2016-06-30 Thread tedsolr
I've read about the sort stream in v6.1 but it appears to me to break the streaming design. If it has to read all the results into memory then it's not streaming. Sounds like it could be slow and memory intensive for very large result sets. Has anyone had good results with the sort stream when

Specify sorting of merged streams

2016-06-29 Thread tedsolr
I'm looking at the streaming API as an alternative to post filtering. The first thing I noticed is the restrictions on sorting. If I have two collections, one has the true field data and the other has linked markers, how can I combine them but also provide the sort desired by the end user?

Re: OT: is Heliosearch discontinued?

2016-06-10 Thread tedsolr
That's fantastic! Thanks Joel -- View this message in context: http://lucene.472066.n3.nabble.com/OT-is-Heliosearch-discontinued-tp4242345p4281792.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Simulate doc linking via post filter cache check

2016-06-10 Thread tedsolr
The terms component will not work for me because it holds on to terms from deleted documents. My indexes are too volatile. I could perform a search for every match - but that would not perform. Maybe I need something that can compare two searches. Anyone know of an existing filter component that does

Re: OT: is Heliosearch discontinued?

2016-06-10 Thread tedsolr
There were many great white papers hosted on that old site. Does anyone know if they were moved? I've got lots of broken links - I wish I could get to that reference material. -- View this message in context:

Re: Simulate doc linking via post filter cache check

2016-05-10 Thread tedsolr
Mikhail, that's an interesting idea. If a terms list could stand in for a cache that may be helpful. What I don't fully see is how the search would work. Building an explicit negative terms query with returned IDs doesn't seem possible as that list would be in the millions. To drastically speed my

Re: auto purge for embedded zookeeper

2016-05-10 Thread tedsolr
That makes perfect sense Shawn. I will clean up the old log data the old fashioned way. thanks, Ted -- View this message in context: http://lucene.472066.n3.nabble.com/auto-purge-for-embedded-zookeeper-tp4275561p4275857.html Sent from the Solr - User mailing list archive at Nabble.com.

Simulate doc linking via post filter cache check

2016-05-10 Thread tedsolr
I'm pulling my hair out on this one - and there's not much of that to begin with. The problem I have is that updating 10M denormalized docs in a collection takes about 5 hours. Soon there will be collections with 100M docs and a 50 hour update cycle will not be acceptable. The process involves

auto purge for embedded zookeeper

2016-05-09 Thread tedsolr
I have a development environment that is using an embedded zookeeper, and the zoo_data folder continues to grow. It's filled with snapshot files that are not getting purged. zoo.cfg has properties autopurge.snapRetainCount=10 autopurge.purgeInterval=1 Perhaps it's not in the correct location so

Re: Replicas for same shard not in sync

2016-04-25 Thread tedsolr
Erick, I was referring to the Achieved Replication Factor section of the Solr reference guide. Maybe I'm misreading it. If an update succeeds on the leader but fails on the replica, it's a success for the

Re: Replicas for same shard not in sync

2016-04-25 Thread tedsolr
I've done a bit of reading - found some other posts with similar questions. So I gather "Optimizing" a collection is rarely a good idea. It does not need to be condensed to a single segment. I also read that it's up to the client to keep track of updates in case commits don't happen on all the

Replicas for same shard not in sync

2016-04-22 Thread tedsolr
I have a SolrCloud setup with v5.2.1 - just two hosts. A ZK ensemble of 3 hosts. Just today, customers searching in one specific collection reported seeing varying results with the same search. I could confirm this by looking at the logs - same search with different hits by the solr host. In the

Re: Can Solr recognize daylight savings time?

2016-03-25 Thread tedsolr
I've never created a Jira issue for Solr. I have the option to create a Service Desk Request. Which one will route to the Solr board? Kylin, Atlas, Apache Infrastructure, Ranger -- View this message in context:

Re: Can Solr recognize daylight savings time?

2016-03-25 Thread tedsolr
Of course! Thanks for your help Shawn. -- View this message in context: http://lucene.472066.n3.nabble.com/Can-Solr-recognize-daylight-savings-time-tp4266047p4266062.html Sent from the Solr - User mailing list archive at Nabble.com.

Can Solr recognize daylight savings time?

2016-03-25 Thread tedsolr
My solr logs are an hour behind. I have set this property to log in local time SOLR_TIMEZONE="EST" but cannot find a property that will "turn on" daylight savings. If there isn't a solr property, maybe there's an apache log4j setting? thanks, v5.2.1 -- View this message in context:
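
A small illustration of the underlying issue, assuming SOLR_TIMEZONE ends up as the JVM's user.timezone: the three-letter ID "EST" is a fixed UTC-5 offset with no daylight-savings rules, while a region ID such as "America/New_York" applies DST automatically, so switching the property to the region ID is usually the fix.

    import java.util.TimeZone;

    public class TimezoneCheck {
        public static void main(String[] args) {
            TimeZone est = TimeZone.getTimeZone("EST");                   // fixed offset, never shifts
            TimeZone newYork = TimeZone.getTimeZone("America/New_York");  // regional rules, shifts for DST
            System.out.println("EST observes DST: " + est.useDaylightTime());                  // false
            System.out.println("America/New_York observes DST: " + newYork.useDaylightTime()); // true
        }
    }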

Re: Performance potential for updating (reindexing) documents

2016-03-24 Thread tedsolr
Hi Erick, My post was scant on details. The numbers I gave for collection sizes are projections for the future. I am in the midst of an upgrade that will be completed within a few weeks. My concern is that I may not be able to produce the throughput necessary to index an entire collection quickly

Performance potential for updating (reindexing) documents

2016-03-24 Thread tedsolr
With a properly tuned solr cloud infrastructure and less than 1B total docs spread out over 50 collections where the largest collection is 100M docs, what is a reasonable target goal for entirely reindexing a single collection? I understand there are a lot of variables, so I'm hypothetically
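
Not an answer to the sizing question, but the client side of a bulk reindex usually ends up as some variation of the hedged sketch below: batch the documents, send each batch as a single add, and commit once at the end (the batch size is a placeholder and the document source is left to the caller).

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BulkIndexer {
        private static final int BATCH_SIZE = 1000; // tune per document size and heap

        public static void index(CloudSolrClient client, Iterable<SolrInputDocument> docs) throws Exception {
            List<SolrInputDocument> batch = new ArrayList<>(BATCH_SIZE);
            for (SolrInputDocument doc : docs) {
                batch.add(doc);
                if (batch.size() == BATCH_SIZE) {
                    client.add(batch); // one request per batch, no per-document commits
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                client.add(batch);
            }
            client.commit(); // single commit at the end (or rely on autoCommit)
        }
    }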

Re: replicate indexing to second site

2016-02-10 Thread tedsolr
Cross data center replication sounds like a great feature. I read Yonik's post on it. I'll keep my ear to the ground. In the meantime it's good to know there's nothing built in to handle this, so it will involve some design effort. I have my head wrapped around sending index requests in parallel,

Re: replicate indexing to second site

2016-02-10 Thread tedsolr
Arcadius, Thanks for sharing your multi data center design. My requirements are different (hot site - warm site) but nevertheless your posts are very interesting. It helps to know that in many cases someone else has already cut their teeth on the problem you're trying to solve. Ted -- View

replicate indexing to second site

2016-02-09 Thread tedsolr
I have a Solr Cloud cluster (v5.2.1) using a Zookeeper ensemble in my primary data center. I am now trying to plan for disaster recovery with an available warm site. I have read (many times) the disaster recovery section in the Apache ref guide. I suppose I don't fully understand it. What I'd

Re: how to control location of solr PID file

2015-12-07 Thread tedsolr
Thanks Hoss. I did not use the service install script to install solr. I don't have the service tool setup as a shortcut. I suppose I'll just have to specify the port when shutting down a node since "-all" does not work. Obviously I'm not a unix admin. -- View this message in context:

how to control location of solr PID file

2015-12-05 Thread tedsolr
I'm running v5.2.1 on red hat linux. The "solr status" command is not recognizing all my nodes. Consequently, "solr stop -all" only stops the node on 8983. What's the recommended file structure for placing multiple nodes on the same host? I am trying multiple "solr" folders within the same solr

Some errors migrating to solr cloud

2015-12-04 Thread tedsolr
I had a fairly simple plan for migrating my single solr instance with multiple cores, to a solrcloud implementation where core => collection. My testing locally (windows) worked fine, but the first linux (development) environment I tried to migrate had some failures. This is v5.2.1. The setup:

Solr logging in local time

2015-11-16 Thread tedsolr
Is it possible to define a timezone for Solr so that logging occurs in local time? My logs appear to be in UTC. Due to daylight savings, I don't think defining a GMT offset in the log4j.properties files will work. thanks! Ted v. 5.2.1 -- View this message in context:

Re: Solr logging in local time

2015-11-16 Thread tedsolr
There are more than a dozen logging sources that are aggregated into Splunk for my application. Solr is only one of them. All the others are logging in local time. Perhaps there is a Splunk centric solution, but I would like to know what the alternatives are. Anyone know how to "fix" (as in

Re: Solr logging in local time

2015-11-16 Thread tedsolr
There is a property for timezone. Just set that in solr.in.sh and logging will use it. The default is UTC. SOLR_TIMEZONE="EST" -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-logging-in-local-time-tp4240369p4240434.html Sent from the Solr - User mailing list archive

SolrClient: reuse the client or just call close()?

2015-11-04 Thread tedsolr
I'm wondering what the best practice is for the implementations of SolrClient: CloudSolrClient & HttpSolrClient. I am caching my clients per collection (core) and reusing them right now. Initially this was prompted by the old solr wiki page and SOLR-861. Is
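
The usual pattern, for what it's worth: both CloudSolrClient and HttpSolrClient are thread-safe, so one long-lived instance per cluster (or per collection, as described above) is created once, shared, and closed only at application shutdown. A sketch against the 5.x-era constructor, with the ZooKeeper address as a placeholder:

    import org.apache.solr.client.solrj.impl.CloudSolrClient;

    public class SolrClientHolder {
        // one shared, thread-safe client for the whole application
        private static final CloudSolrClient CLIENT = new CloudSolrClient("zkhost:2181");

        public static CloudSolrClient get() {
            return CLIENT;
        }

        public static void shutdown() throws Exception {
            CLIENT.close(); // close once, when the application stops
        }
    }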

creating collection with solr5 - missing config data

2015-11-02 Thread tedsolr
I'm trying to plan a migration from a standalone solr instance to the solrcloud. I understand the basic steps but am getting tripped up just trying to create a new collection. For simplicity, I'm testing this on a single machine, so I was trying to use the embedded zookeeper. I can't figure out

Re: creating collection with solr5 - missing config data

2015-11-02 Thread tedsolr
Thanks Erick, that did it. I had thought the -z option was only for external zookeepers. Using port 9983 allowed me to upload a config. -- View this message in context: http://lucene.472066.n3.nabble.com/creating-collection-with-solr5-missing-config-data-tp4237802p4237811.html Sent from the

Re: Merging documents from a distributed search

2015-09-08 Thread tedsolr
Joel, It needs to perform. Typically users will have 1 - 5 million rows in a query, returning 10 - 15 fields. Grouping reduces the return by 50% or more normally. Responses tend to be less than half a second. It sounds like the manipulation of docs at the collector level has been left to the single

Re: Merging documents from a distributed search

2015-09-04 Thread tedsolr
Upayavira, The docs are all unique. In my example the two docs are considered to be dupes because the requested fields all have the same values. Fields: A, B, C, D, E. Doc 1: apple, 10, 15, bye, yellow. Doc 2: apple, 12, 15, by, green. The two docs are certainly unique. Say they are on

RE: Merging documents from a distributed search

2015-09-03 Thread tedsolr
Markus, did you mistakenly post a link to this same thread? -- View this message in context: http://lucene.472066.n3.nabble.com/Merging-documents-from-a-distributed-search-tp4226802p4227035.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Merging documents from a distributed search

2015-09-03 Thread tedsolr
Thanks Joel, that link looks promising. The CloudSolrStream bypasses my issue of multiple shards. Perhaps the ReducerStream would provide what I need. At first glance I worry that the buffer would grow too large - if it's really holding the values for all the fields in each document

Re: Custom merge logic in SolrCloud.

2015-09-03 Thread tedsolr
I am facing a similar issue. See this thread -- View this message in context: http://lucene.472066.n3.nabble.com/Custom-merge-logic-in-SolrCloud-tp4226325p4227073.html Sent from the Solr - User

Merging documents from a distributed search

2015-09-02 Thread tedsolr
I've read from http://heliosearch.org/solrs-mergestrategy/ that the AnalyticsQuery component only works for a single instance of Solr. I'm planning to "migrate" to the SolrCloud soon and I have a custom AnalyticsQuery module that collapses what I

Re: How to find the ordinal for a numeric doc value

2015-08-20 Thread tedsolr
I see. The UninvertingReader even throws an IllegalStateException if you try to read a numeric field as sorted doc values. I may have to index extra fields to support my document collapsing scheme. Thanks for responding. -- View this message in context:

How to find the ordinal for a numeric doc value

2015-08-19 Thread tedsolr
I'm trying to upgrade my custom post filter from Solr 4.9 to 5.2. This filter collapses documents based on a user chosen field set. The key to the whole thing is determining document uniqueness based on a fixed int array of field value ordinals. In 4.9 this worked regardless of the field type. In

Re: How to find the ordinal for a numeric doc value

2015-08-19 Thread tedsolr
One error (others perhaps?) in my statement ... the code searcher.getLeafReader().getSortedDocValues(field) just returns null for numeric and date fields. That is why they appear to be ignored, not that the ordinals are all absent or equivalent. But my question is still valid I think! -- View

Re: Migrating from solr cores to collections

2015-07-15 Thread tedsolr
After playing with SolrCloud I answered my own question: multiple collections can live on the same node. Following the how-to in the solr-ref-guide was getting me confused. -- View this message in context:

Migrating from solr cores to collections

2015-07-14 Thread tedsolr
I am in the process of migrating from a single Solr instance, with multiple cores, to the SolrCloud. My product uses cores to physically separate our customers' data: CocaCola has its own core, Pepsi has its own, etc. I want to keep that physical separation but of course I need horizontal scaling

Re: Reverse deep paging

2015-02-03 Thread tedsolr
Oh, I know I have problems! My (b) option of reversing sort and using the current cursor mark is not working. It gets off by one record. paging forward: pg 1: docs 1-10 pg 2: docs 11-20 pg 3: docs 21-30 now paging backwards: pg 2: docs 10-19 I'll go back to tracking all the cursor marks. --

Reverse deep paging

2015-02-02 Thread tedsolr
I'm surprised I haven't seen a post on this, but maybe the answers are obvious. I'm using a cursor to page the results. If I want to enable reverse paging (go back one page) I have to either: a) Keep a map of all cursor marks the user made paging forward. This map could get very long if a user
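
Option (a) from the post, sketched in SolrJ: keep the start cursor mark of every page already shown on a stack while paging forward, then pop two entries to step back one page. The class and method names are placeholders, and the caller is expected to supply a query that sorts on the unique key and sets rows to the page size.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.params.CursorMarkParams;

    public class PagingState {
        private final Deque<String> visited = new ArrayDeque<>(); // start mark of each page shown

        public QueryResponse firstPage(SolrClient client, SolrQuery q) throws Exception {
            return page(client, q, CursorMarkParams.CURSOR_MARK_START);
        }

        public QueryResponse nextPage(SolrClient client, SolrQuery q, QueryResponse current) throws Exception {
            return page(client, q, current.getNextCursorMark());
        }

        public QueryResponse previousPage(SolrClient client, SolrQuery q) throws Exception {
            visited.pop();                        // drop the current page's start mark
            String previousStart = visited.pop(); // start mark of the page before it
            return page(client, q, previousStart);
        }

        private QueryResponse page(SolrClient client, SolrQuery q, String startMark) throws Exception {
            q.set(CursorMarkParams.CURSOR_MARK_PARAM, startMark);
            visited.push(startMark);
            return client.query(q);
        }
    }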

Dynamically change sort?

2015-01-27 Thread tedsolr
Hi. I'm trying to sort on computed values that my QParserPlugin creates. I've found out that I can make it happen by adding a fake scorer to the delegates before collect() is called. The problem I have now is how to modify the sort field in mid stream? The user selects COUNT as a sort field, but

Re: Dynamically change sort?

2015-01-27 Thread tedsolr
Brilliant! I didn't know what the prepare() method of a SearchComponent could do. I can modify the SolrParams by request.setParams() and add custom data to the request context to tell my Collector how to figure the score. Thank you! -- View this message in context:

Re: Sorting on a computed value

2015-01-26 Thread tedsolr
That's an interesting link Shawn. Especially since it mentions the possibility of sorting on pseudo-fields. My delegating collector computes the custom stats and stores them in the request context. I have a doc transformer that then grabs the stats for each doc and inserts the data in the

Sorting on a computed value

2015-01-25 Thread tedsolr
I'll bet some super user has figured this out. How can I perform a sort on a single computed field? I have a QParserPlugin that is collapsing docs based on data from multiple fields. I am summing the values from one numerical field 'X'. I was going to use a DocTransformer to inject that summed

How to inject custom response data after results have been sorted

2015-01-23 Thread tedsolr
Hello! With the help of this community I have solved 2 problems on my way to creating a search that collapses documents based on multiple fields. The CollapsingQParserPlugin was key. I have a new problem now. All the custom stats I generate in my custom QParser makes for way too much data to

Re: How to inject custom response data after results have been sorted

2015-01-23 Thread tedsolr
Thank you so much for your responses Hoss and Shalin. I gather the DocTransformer allows manipulations to the doc list returned in the results. That is very cool. So the transformer has access to the Solr Request. I haven't seen the hook yet, but I believe you - I'll have to keep looking. It would

Re: How to return custom collector info

2015-01-21 Thread tedsolr
I was confused because I couldn't believe my jars might be out of sync. But of course they were. I had to create a new eclipse project to sort it out, but that exception has disappeared. Sorry for the confusing post. -- View this message in context:

Re: How to return custom collector info

2015-01-20 Thread tedsolr
Joel, Thank you for the links. The AnalyticsQuery is just the thing I need to return custom stats in the response. What I'm struggling with now, is how to read the doc field values. I've been following the CollapsingQParserPlugin model of accessing the field cache in the Query class

How to return custom collector info

2015-01-19 Thread tedsolr
I am investigating possible modifications to the CollapsingQParserPlugin that will allow me to collapse documents based on multiple fields. In a quick test I was able to make this happen with two fields, so I assume I can expand that to N fields. What I'm missing now is the extra data I need per

Re: Engage custom hit collector for special search processing

2015-01-14 Thread tedsolr
Thank you so much Alex and Joel for your ideas. I am poring over the documentation and code now to try and understand it all. A post filter sounds promising. As 99% of my doc fields are character based I should try to complement the collapsing Q parser with an option that compares string fields

Re: Engage custom hit collector for special search processing

2015-01-13 Thread tedsolr
As insane as it sounds, I need to process all the results. No one document is more or less important than another. Only a few hundred unique docs will be sent to the client at any one time, but the users expect to page through them all. I don't expect sub-second performance for this task. I'm

Engage custom hit collector for special search processing

2015-01-13 Thread tedsolr
I have a complicated problem to solve, and I don't know enough about lucene/solr to phrase the question properly. This is kind of a shot in the dark. My requirement is to return search results always in completely collapsed form, rolling up duplicates with a count. Duplicates are defined by

exporting to CSV with solrj

2014-10-31 Thread tedsolr
I am trying to invoke the CSVResponseWriter to create a CSV file of all stored fields. There are millions of documents so I need to write to the file iteratively. I saw a snippet of code online that claimed it could effectively remove the SolrDocumentList wrapper and allow the docs to be retrieved
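
One hedged alternative to writing a custom exporter: request wt=csv but stop SolrJ from parsing the reply by installing a NoOpResponseParser, so the raw CSV text comes back as a string (believed to land under the "response" key of the returned NamedList); combined with cursorMark paging it can be written out to the file chunk by chunk. The URL, collection and field list are placeholders.

    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.impl.NoOpResponseParser;
    import org.apache.solr.client.solrj.request.QueryRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;
    import org.apache.solr.common.util.NamedList;

    public class CsvExport {
        public static void main(String[] args) throws Exception {
            try (HttpSolrClient client =
                     new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build()) {
                ModifiableSolrParams params = new ModifiableSolrParams();
                params.set("q", "*:*");
                params.set("fl", "id,fieldA,fieldB");
                params.set("rows", 1000); // page with cursorMark for a full multi-million row dump
                params.set("wt", "csv");

                QueryRequest request = new QueryRequest(params);
                request.setResponseParser(new NoOpResponseParser("csv")); // return the body unparsed
                NamedList<Object> response = client.request(request);
                String csv = (String) response.get("response"); // raw CSV payload
                System.out.print(csv);
            }
        }
    }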

Re: exporting to CSV with solrj

2014-10-31 Thread tedsolr
Sure thing, but how do I get the results output in CSV format? response.getResults() is a list of SolrDocuments. -- View this message in context: http://lucene.472066.n3.nabble.com/exporting-to-CSV-with-solrj-tp4166845p4166861.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: exporting to CSV with solrj

2014-10-31 Thread tedsolr
I think I'm getting the idea now. You either use the response writer via an HTTP call, or you write your own exporter. Thanks to everyone for their input. -- View this message in context: http://lucene.472066.n3.nabble.com/exporting-to-CSV-with-solrj-tp4166845p4166889.html Sent from the Solr -
