Hi,
I am planning to write a custom aggregator in Solr which will use some
probabilistic data structures per shard to accumulate results; after
shard merging, the result will be sent to the user as an integer.
I have explored two options to do this:
1. Solr analytics API
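The per-shard accumulate/merge idea above can be sketched with a toy HyperLogLog. This is plain illustration code, not Solr's analytics API; the register count and hash choice are made up for the example:

```python
import hashlib

# Toy HyperLogLog-style sketch: each shard builds a register array, and the
# merge step is just element-wise max before producing one integer estimate.
M = 64  # number of registers (real HLLs use 2^p registers, p >= 4)

def _hash(value):
    return int(hashlib.md5(str(value).encode()).hexdigest(), 16)

def accumulate(values):
    """Build a register array for one shard's values."""
    regs = [0] * M
    for v in values:
        h = _hash(v)
        idx = h % M          # which register this value lands in
        rest = h // M        # remaining hash bits
        rank = 1             # position of first set bit in the remaining bits
        while rest % 2 == 0 and rank < 64:
            rest //= 2
            rank += 1
        regs[idx] = max(regs[idx], rank)
    return regs

def merge(a, b):
    """Merging shard sketches is just element-wise max."""
    return [max(x, y) for x, y in zip(a, b)]

shard1 = accumulate(range(0, 5000))
shard2 = accumulate(range(2500, 7500))   # overlaps shard1
merged = merge(shard1, shard2)
# Raw HLL estimate: alpha_m * m^2 / sum(2^-reg); bias corrections omitted.
est = 0.709 * M * M / sum(2.0 ** -r for r in merged)
print(int(est))  # estimate of the distinct count across both shards
```

The key property for distributed aggregation is that `merge` is associative and idempotent, so shards can be combined in any order before the final integer is computed.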
bq: My guess so far is that the filter has
to fetch the unique key for all documents in results, which consumes a
lot of resources.
Guessing here and going from memory, but... If you have some code
like reader.get(doc).get("id") it'll totally barf. Problem here is that to
get the id field, it has
On 3/17/2016 10:39 AM, Victor D'agostino wrote:
> I have a java.lang.ClassNotFoundException: solr.MockTokenizerFactory
> after a fresh 5.5.0 setup with DIH and a collection named "db".
>
> The tgz file is from
> http://apache.crihan.fr/dist/lucene/solr/5.5.0/solr-5.5.0.tgz
>
> Any idea why this
Shawn, thank you very much!
So, I didn't have an account in the old wiki; can you add me as a
contributor?
Just created.
I will then proceed with adding the classification documentation.
AlessandroBenedetti
benedetti.ale...@gmail.com
Cheers
On Wed, Mar 16, 2016 at 1:01 AM, Shawn Heisey
Like Francisco said, use a custom update processor to map the fields the
way you want and add it to your update chain.
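A minimal solrconfig.xml sketch of such a chain, assuming the goal is to index an incoming field under a different name; the chain name and the field names (title_src, title) are hypothetical:

```xml
<updateRequestProcessorChain name="map-fields">
  <!-- copy the incoming "title_src" field into "title" -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">title_src</str>
    <str name="dest">title</str>
  </processor>
  <!-- drop the original field so only the mapped name is indexed -->
  <processor class="solr.IgnoreFieldUpdateProcessorFactory">
    <str name="fieldName">title_src</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain then has to be referenced from your update handler (or via the `update.chain` request parameter) to take effect.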
On Wed, 16 Mar 2016, 18:16 Francisco Andrés Fernández,
wrote:
> Vidya, I don't know if I'm understanding it very well but, I think that the
> best way is to
On 3/16/2016 4:33 AM, Zheng Lin Edwin Yeo wrote:
> I found that HMMChineseTokenizer will split a string that consists of
> numbers and letters (alphanumeric). For example, if I have a code that
> looks like "1a2b3c4d", it will be split to 1 | a | 2 | b | 3 | c | 4 | d
> This has caused the
First of all, "optimize-like" does _not_ happen
"every time a commit happens". What _does_ happen
is the current state of the index is examined and if
certain conditions are met _then_ segment
merges happen. Think of these as "partial optimizes".
This is under the control of the TieredMergePolicy by
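For reference, a sketch of how that policy is typically configured in solrconfig.xml on the 5.x line; the values shown are the Lucene defaults, not a recommendation:

```xml
<indexConfig>
  <!-- These knobs decide when the "partial optimize" merges fire:
       segments are merged once a tier accumulates more than
       segmentsPerTier similarly-sized segments. -->
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">10</int>
    <int name="segmentsPerTier">10</int>
  </mergePolicy>
</indexConfig>
```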
Well, if using managed schema in SolrCloud, all the updates
to the nodes are automatic, so it's easier from that perspective.
To me, the sweet spot for managed schema is that it lends
itself to some kind of front end that allows you to deal with the
schema visually; one can envision widgets,
On Thursday, March 17, 2016 7:58 PM, wun...@wunderwood.org wrote:
>
> Think about using popularity as a boost. If one movie has a million rentals
> and one has a hundred rentals, there is no additive formula that balances
> that with text relevance. Even with log(popularity), it doesn't work.
On 3/17/2016 2:32 PM, Shamik Bandopadhyay wrote:
> [2016-03-17 20:23:34,760]ERROR
> 9350[coreLoadExecutor-7-thread-1-processing-n:54.176.219.134:8983_solr] -
> org.apache.solr.core.CoreContainer.create(CoreContainer.java:827) - Error
> creating core [knowledge]:
That works fine if you have a query that matches things with a wide range of
popularities. But that is the easy case.
What about the query “twilight”, which matches all the Twilight movies, all of
which are popular (millions of views)? Or “Lord of the Rings”, which only
matches movies with
Popularity has a very wide range. Try my example: scale 1 million and 100 into
the same 0.0-1.0 range. Even with log popularity.
As another poster pointed out, text relevance scores also have a wide range.
In practice, I never could get additive boost to work right at Netflix at both
ends of
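To make the arithmetic in this thread concrete, here is a toy sketch of the scale problem; all the numbers are invented for illustration, not real rental or relevance data:

```python
import math

# Normalize log(popularity) for 1,000,000 vs 100 rentals into 0..1 and
# compare against typical text-relevance scores.
def log_norm(pop, max_pop=1_000_000):
    return math.log10(pop) / math.log10(max_pop)

blockbuster = log_norm(1_000_000)   # 1.0
obscure     = log_norm(100)         # ~0.33

# Suppose text relevance puts the obscure title well ahead on match quality:
text_hit_obscure, text_hit_blockbuster = 8.0, 5.0

# Additive boost: popularity (0..1) is noise next to text scores (~5-8);
# scale popularity up enough to matter and it drowns the text score instead.
additive_obscure     = text_hit_obscure     + obscure
additive_blockbuster = text_hit_blockbuster + blockbuster

# Multiplicative boost keeps the two signals in proportion instead:
mult_obscure     = text_hit_obscure     * (1 + obscure)
mult_blockbuster = text_hit_blockbuster * (1 + blockbuster)
print(additive_obscure, additive_blockbuster, mult_obscure, mult_blockbuster)
```

With the additive form, the four orders of magnitude between 100 and 1,000,000 rentals collapse into a 0.67 difference that text scores swamp; the multiplicative form lets popularity rescale whatever text score the query produced.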
Tie does quite a bit; without it, only the highest-weighted field that has
the term will be included in the relevance score. Tie lets you include the
other fields that match as well.
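The combination the tie parameter controls can be sketched as follows; the formula is the standard DisjunctionMaxQuery one, but the per-field scores are invented:

```python
# How the (e)dismax "tie" parameter combines per-field scores for one term:
#   score = max(field_scores) + tie * sum(all other field_scores)
def dismax(field_scores, tie):
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)

scores = [3.0, 1.5, 0.5]    # hypothetical per-field scores for one term
print(dismax(scores, 0.0))  # pure dismax: only the best field counts
print(dismax(scores, 0.3))  # other matching fields contribute a fraction
print(dismax(scores, 1.0))  # pure sum across all matching fields
```

So tie=0 is "winner takes all", tie=1 is a plain sum, and values in between let secondary field matches break ties between documents whose best field scores the same.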
On Mar 18, 2016 10:40 AM, "Robert Brown" wrote:
> Thanks for the added input.
>
> I'll
Hi
Nothing OOTB that I know of, but it would be 3 lines to create a custom one, which simply
aborts the chain instead of calling super.processAdd(command).
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
> 16. mar. 2016 kl. 12.36 skrev solr2020 :
>
> Hi,
>
Hi,
I am writing a Solr application; can anyone please let me know how to unit test
it?
I see we have the MiniSolrCloudCluster class available in Solr, but I am confused
about how to use it for unit testing.
How should I create an embedded server for unit testing?
Thanks,
Naveen
Thanks for saying. I thought as soon as I sent it that my motivation might
just be to brag that I know something that long-time Solr folks like you might
not. I actually know so very little, not just about how Lucene works, but how
to make Solr solve concrete problems beyond the simple. I
Hi
You can look at the Apache Tika project or the PDFBox project to parse your
files before sending to Solr.
Alternatively, if your processing is very simple, you can use the built-in Tika
as you just did, and then deploy some UpdateRequestProcessors in order to modify
the Tika output into
So, each soft commit would create a new searcher that would invalidate
the old cache?
Here is the configuration for the Document Cache:
<documentCache ... autowarmCount="0"/>
<enableLazyFieldLoading>true</enableLazyFieldLoading>
Thanks
On 3/18/16 12:45 AM, Emir Arnautovic wrote:
Hi,
Your cache will be cleared on soft commits - every two minutes. It seems
that
On Tue, Mar 15, 2016 at 07:58:21PM -0600, Shawn Heisey wrote:
> On 3/15/2016 2:56 PM, Paul Hoffman wrote:
> >> It sure looks like I started Solr from my blacklight project dir.
> >>
> >> Any ideas? Thanks,
> >>
>
> You may need to get some help from the blacklight project. I've got
> absolutely
When using security.json (in Solr 5.4.1 for instance), is there a recommended
method to allow users to change their own passwords? We certainly would not
want to grant blanket security-edit to all users; but requiring users to
divulge their intended passwords (in Email or by other means) to the
Hi Shawn,
Thanks for your response.
CDH is a Cloudera (third-party) distribution. Is there any way to get a
notification when the cluster state changes? In the logs?
I can assume that the exception is the result of no replicas being available.
Agree?
Regards,
Anil
On 18 March 2016 at
hey all,
is there any out-of-the-box way to use your stop words to completely skip a
document? If something has X in its description when being indexed, I just
want to ignore it altogether; when something is searched with X, then go
ahead and automatically return 0 results. Quick context: using
Hello,
Please find my answers inline.
On Wed, Mar 16, 2016 at 10:10 PM, Alisa Z. wrote:
> Hi all,
> I have a deeply multi-level data structure (up to 6-7 levels deep) where
> due to the nature of the data some nested documents can have same type
> names at various levels. How to form a
Can someone help?
Does using the Schema API mean no upconfig to ZooKeeper and no reloading
of all the nodes in my SolrCloud? In which scenario should I not use the Schema
API, if any?
Thanks
Jay
On Wed, Mar 16, 2016 at 6:22 PM, Shawn Heisey wrote:
> On 3/16/2016 1:14 AM, Alexandre Rafalovitch
Hi,
We are using SolrCloud with ZooKeeper, and each collection has 5 shards and
2 replicas.
We are seeing "org.apache.solr.client.solrj.SolrServerException: No live
SolrServers available to handle this request". I don't see any issues with the
replicas.
What would be the root cause of this exception?
Hi Matt,
When you say "soon looking to move to a different approach (ngrams)",
do you mean creating a specific core, with a specific analysis for the
fields of interest?
Is upgrading Solr not an option in your situation?
Cheers
On Wed, Mar 16, 2016 at 10:05 PM, Matt Kuiper