Hi Lance,
does this do what you want?
http://maven.apache.org/plugins/maven-assembly-plugin/descriptor-refs.html#jar-with-dependencies
It's Maven, but I'd say that's an advantage… ;-)
Chantal
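For reference, the relevant pom.xml fragment is roughly this (a sketch using the standard plugin coordinates; plugin version omitted):

```xml
<!-- Builds one jar containing the project classes plus all dependencies -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
</plugin>
```

Running `mvn assembly:single` then produces the self-contained jar.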
On 05.08.2012 at 01:25, Lance Norskog wrote:
Has anybody tried packaging the contrib
Hello,
I've got the problem description below. Can you explain the expected user
experience, and/or solution approach before diving into the algorithm
design?
Thanks
On Sat, Aug 18, 2012 at 2:50 AM, Karthick Duraisamy Soundararaj
karthick.soundara...@gmail.com wrote:
My problem is that when
Hi,
I think the response is yes, but I need to check.
Is it possible to upgrade from Solr 3.4 to Solr 3.6.1 without rebuilding
the existing index?
Thank you.
Dominique
On 09.08.2012 at 18:02, Robert Muir wrote:
On Thu, Aug 9, 2012 at 10:20 AM, tech.vronk t...@vronk.net wrote:
Hello,
I wonder how to figure out the total token count in a collection (per
index), i.e. the size of a corpus/collection measured in tokens.
You want to use this statistic, which
Date queries are described here: http://wiki.apache.org/solr/SolrQuerySyntax
You must first make sure your dates end up in a Date fieldType and are in the
proper format.
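As a side note, the canonical format Solr's date fields expect is ISO 8601 in UTC with a trailing `Z` (e.g. `1995-12-31T23:59:59Z`). A small scripting-language sketch of producing it, assuming Python and timezone-aware timestamps:

```python
from datetime import datetime, timezone

def to_solr_date(dt: datetime) -> str:
    """Render a datetime in the canonical UTC form Solr's date fields expect."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# e.g. to_solr_date(datetime(2012, 8, 20, 13, 57, tzinfo=timezone.utc))
#   -> "2012-08-20T13:57:00Z"
```

Dates in this form can then be used directly in range queries like `timestamp:[2012-01-01T00:00:00Z TO NOW]`.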
-Original message-
From:Dotan Cohen dotanco...@gmail.com
Sent: Mon 20-Aug-2012 13:57
To:
On Mon, Aug 20, 2012 at 3:00 PM, Markus Jelsma
markus.jel...@openindex.io wrote:
Date queries are described here: http://wiki.apache.org/solr/SolrQuerySyntax
Terrific, thank you!
You must first make sure your dates end up in a Date fieldType and are in the
proper format.
Thanks.
--
It's pretty easy to accidentally run into the AWT stuff if you're
doing anything that involves image processing, which I would expect a
generic RTF parser might do.
Michael Della Bitta
Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
After pondering it for a while, I decided to take the advice and write the
processing as a separate program. It will probably be easier to pre-format
the data with a scripting language anyway.
Thank you for taking the time to reply. :)
- Fuu
Hello Mikhail,
Thank you for the reply. In terms of user
experience, I want to spread out the products from the same brand farther from
each other, *at least* in the first 50-100 results we display. I am thinking
about two different approaches as solutions.
At the Lucene level the index should be 100% compatible, but I don't know
with 100% certainty whether there may be subtle changes in any field type
analyzers or token filters, such as in the example schema. You might want to
read SOLR-2519 and see whether your fields and field types may be
Hi there,
I am new to Solr and all this. Besides, I am a Java noob.
What I am doing:
I want to do full-text retrieval on office documents. The metadata of
these documents is maintained in PostgreSQL.
So the only information I need to get out of Solr is a document ID.
My problem now is, that my index
Dear Robert,
could you give me a little more information about your setting? For example the
complete solrconfig.xml and
the complete schema.xml would definitely help.
Best,
Sven
--
kippdata informationstechnologie GmbH
Sven Maurmann Tel: 0228 98549 -12
Bornheimer Str. 33a
The CHANGES.txt file (make sure to look in the Lucene version as well
as Solr) will have, for each new version, a section about upgrading
that should answer it for you...
Best
Erick
On Mon, Aug 20, 2012 at 3:13 AM, Dominique Bejean
dominique.bej...@eolya.fr wrote:
Hi,
I think the
Tanguy,
Your idea is perfect for cases where there are too many
documents with 80-90% of documents having the same value for a particular field.
As an example, your idea is ideal for, let's say, 10 documents in
total, like this:
doc1: <merchantName>Kellog's</merchantName>
doc2:
How are you ingesting the office documents? SolrCell, or some other method?
Do you have CopyFields? What fields are you querying on?
What does your text field type look like?
-- Jack Krupansky
-Original Message-
From: robert rottermann
Sent: Monday, August 20, 2012 10:39 AM
To:
We are using SOLR and are in the process of adding a custom filter factory to
handle the processing of words/tokens to suit our needs.
Here is what our custom filter factory does:
1) Reads the tokens, does some analysis, and writes the result of the analysis
to a database.
We are using Embedded Solr
First, the obvious question: What kind of information? Be specific.
Second, you can pass parameters to your filter factory in your field type
definitions. You could have separate schemas or separate field types for the
different indexes. Is there anything this doesn't cover?
You can also
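To illustrate the parameter-passing idea, a hedged schema.xml sketch - the factory class and the `dbName` attribute here are hypothetical names for your own code, not a real Solr API:

```xml
<!-- Extra attributes on the <filter> element are handed to the factory's
     init(Map<String,String> args) method -->
<fieldType name="text_clientA" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="com.example.DbWritingFilterFactory" dbName="clientA_db"/>
  </analyzer>
</fieldType>
```

A second field type (or a second core's schema) can pass a different `dbName`.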
Thanks Jack.
The information I want to pass is the database name into which the analyzed
data needs to be inserted.
As I was saying earlier, the setup we have is:
1) we use an embedded Solr server with multiple cores - embedded into our webapp
2) support one index for each client - each client has a
-Original message-
From:ksu wildcats ksu.wildc...@gmail.com
Sent: Mon 20-Aug-2012 20:28
To: solr-user@lucene.apache.org
Subject: Re: Solr Custom Filter Factory - How to pass parameters?
Thanks Jack.
The information I want to pass is the database name into which the analyzed
Thank you to both of you.
On 20/08/12 at 17:28, Erick Erickson wrote:
The CHANGES.txt file (make sure to look in the Lucene version as well
as Solr) will have, for each new version, a section about upgrading
that should answer it for you...
Best
Erick
On Mon, Aug 20, 2012 at 3:13 AM,
Hello,
I don't believe your task can be solved by playing with scoring/collectors
or shuffling.
For me it's absolutely a Grouping use case (although I don't really know this
feature well).
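For reference, the grouping request parameters would look something like this (the field name `brand` is hypothetical):

```text
q=*:*&group=true&group.field=brand&group.limit=1
```

That returns at most `group.limit` documents per distinct `brand` value.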
Grouping cannot solve the problem because I don't want to limit the number
of results shown based on the
Thanks Markus.
Links are helpful. I will give it a try and see if that solves my problem.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Custom-Filter-Factory-How-to-pass-parameters-tp4002217p4002248.html
Sent from the Solr - User mailing list archive at Nabble.com.
NRT does not work because the index updates hundreds of times per second vs.
a cache warm-up time of a few minutes, so we are in a loop.
allowing you to query
your huge index in ms.
Solr also allows querying in ms. What is the difference? No one can sort
1,000,000 terms in descending count order faster
I am trying to use shingles and a position filter so that a query for
"foot print", for example, matches either "foot print" or "footprint".
From the docs: using the PositionFilter
http://wiki.apache.org/solr/PositionFilter in combination makes it
possible to make all shingles synonyms of each other.
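A query-analyzer sketch along those lines for schema.xml (a hedged example: the empty `tokenSeparator` makes the shingle of "foot" + "print" come out as "footprint" so it can match that indexed form):

```xml
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
          outputUnigrams="true" tokenSeparator=""/>
  <filter class="solr.PositionFilterFactory"/>
</analyzer>
```

The PositionFilter then flattens the positions so the unigrams and shingles act as synonyms of each other.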
Hi All,
I have a problem (Yonik, please help me!): what are the term count limits? Can
I possibly have 256,000,000 different terms in a field, or only 16,000,000? Can
I temporarily disable this feature?
Thanks!
2012-08-20 16:20:19,262 ERROR [solr.core.SolrCore] - [pool-1-thread-1] - :
Hello,
Does anyone have the grammar file (.jj file) for the complex phrase query
parser. The patch from https://issues.apache.org/jira/browse/SOLR-1604 does
not have the grammar file as part of it.
Thanks,
Phani.
Hi Mikhail,
You are correct. "[+] show 6 results.." would work, but
it wouldn't suit my requirements. This is a question of user experience,
right?
Imagine the product manager comes to you and says "I don't want to see
'[+] show 6 results..' and I want the results to be diverse" but
Hi folks. I read some posts in the past about this subject, but nothing that
definitively answers my question.
I am trying to understand the trade-off when you use a large number of fields
(not sure what a quantitative value of "large" is in Solr .. say 200 fields)
versus a join - and even a multi
Hi All,
I have a problem (Yonik, please help me!): what are the term count limits? Can
I possibly have 256,000,000 different terms in a field, or only 16,000,000?
Thanks!
2012-08-20 16:20:19,262 ERROR [solr.core.SolrCore] - [pool-1-thread-1] - :
org.apache.solr.common.SolrException: Too many values for
I have, for example, jobs from country A, jobs from country B, and so on, up to
100 countries. I need a separate index for each country, because if
someone searches for jobs in country A I need to query only the index for
country A. How do I solve this problem?
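One common way is Solr's multi-core setup: a single Solr instance with one core (index) per country, declared in solr.xml. A sketch - the core names and paths here are hypothetical:

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="jobs_countryA" instanceDir="jobs_countryA"/>
    <core name="jobs_countryB" instanceDir="jobs_countryB"/>
    <!-- ... one core per country ... -->
  </cores>
</solr>
```

A search for country A then goes only to `/solr/jobs_countryA/select`.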
Ah! Will the text be in different
Hi Lance,
how would that work? Generation is essentially versioning, right?
I also don't see why you need to use ZK to do this, as it's all on a single
machine; I was hoping for a simpler solution :)
On Sun, 19 Aug 2012 19:26:41 -0700, Lance Norskog goks...@gmail.com
wrote:
I would use generation
It appears that there is a hard limit of 24 bits, or 16M, on the number of
bytes used to reference the terms in a single field of a single document. It
takes 1, 2, 3, 4, or 5 bytes to reference a term. If it took 4 bytes, that
would allow 16M/4, or 4 million, unique terms - per document. Do you have
Is this required by your application? Is there any way to reduce the
number of terms?
A workaround is to use shards. If your terms follow Zipf's Law, each
shard will have fewer than the complete number of terms. For N shards,
each shard will have ~1/N of the singleton terms. For 2-count terms,
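The estimate above can be checked with a quick back-of-the-envelope sketch (not Solr code; it assumes documents are assigned to shards uniformly at random):

```python
# Expected number of shards that contain a term occurring in k documents,
# when documents are spread over N shards uniformly at random.
def expected_shards_with_term(k: int, n_shards: int) -> float:
    # P(a given shard holds none of the k docs) = (1 - 1/N)^k
    return n_shards * (1.0 - (1.0 - 1.0 / n_shards) ** k)

# A singleton term (k=1) lands on exactly one shard, so each of N shards
# carries ~1/N of the singleton vocabulary; a 2-count term is duplicated
# on slightly fewer than 2 shards on average.
```

So sharding dilutes the per-shard unique-term count most strongly for the rare terms, which under Zipf's Law dominate the vocabulary.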
Yes, by generations I meant versioning. The problem is that you have
to have a central holder of the current generation number. ZK does
this very well. It is a distributed synchronized file system for very
small files. If you have a more natural place to store the current
generation number, that's
Join works best with a small number of unique values. Unfortunately,
people often want to join on uniqueKey, which is by definition
unique per document.
The usual advice is to first try to flatten your data as much as possible.
There's also some ongoing work on block joins that you may want to
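For reference, the join query parser syntax (available from Solr 4.0) looks like this - the field names here are hypothetical:

```text
q={!join from=parent_id to=id}color:red
```

This returns documents whose `id` matches the `parent_id` of documents matching `color:red`.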
Usually, search results are sorted by their score (how well the document
matched the query), but it is common to need to support the sorting of
supplied data too.
Boosting affects the scores of matching documents in order to affect ranking
in score-sorted search results. Providing a boost value,
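As a concrete illustration of a query-time boost (field names are hypothetical; the `^2.0` multiplies the score contribution of `title` matches):

```text
q=title:ipod^2.0 OR description:ipod
```

Documents matching in `title` will therefore tend to rank above those matching only in `description`.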
Hi guys,
From http://wiki.apache.org/solr/MergingSolrIndexes, it says: 'Using
srcCore, care is taken to ensure that the merged index is not corrupted
even if writes are happening in parallel on the source index'.
What does that mean? If there are deletion requests during merging, will this