On Wed, Jul 31, 2013 at 4:56 AM, Bill Bell billnb...@gmail.com wrote:
On Jul 30, 2013, at 12:34 PM, Dotan Cohen dotanco...@gmail.com wrote:
On Tue, Jul 30, 2013 at 9:21 PM, Aloke Ghoshal alghos...@gmail.com wrote:
Does adding facet.mincount=2 help?
In fact, when adding facet.mincount=20 (I know that some dupes are in
the hundreds) I got the OutOfMemoryError in seconds instead of
minutes.
--
Dotan Cohen
http://gibberish.co.il
them out.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
longer! Also, I fear
that this is not a one-time problem; rather, I should learn now how to
deal with tuning Solr for intensive queries such as this. I
learn by the problems encountered!
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On Tue, Jul 30, 2013 at 9:56 PM, Shawn Heisey s...@elyograg.org wrote:
On 7/30/2013 12:49 PM, Dotan Cohen wrote:
Thanks, the query ran for almost 2 full minutes but it returned
results! I'll google for how to increase the disk cache for queries
like this. Other than the Qtime
into that.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
here then it's a new document
return super.addDoc(cmd);
}
}
And I give a bunch of examples in my book.
I anticipate the book with esteem!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
times matched)
eat (2 times matched)
love, cake, you, will, candy (1 time each)
Thanks!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
Thank you Jack and Koji. I will take a look at MLT and also at the
.zip files from LUCENE-474. Koji, did you have to modify the code for
the latest Solr?
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
OR this OR that OR left
OR right OR north OR south OR east OR west</str></lst></lst>
<result name="response" numFound="22495012" start="0">
My index currently has 77461952 documents, most under 1 KiB each but
upwards of ten fields.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
/1449359957/
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
on. The issue remains the same even when reversing
the order of the pivot:
facet.pivot=provider,added
Is this a Solr bug, or am I pivoting wrong? This is on Solr 4.1.0
running on OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode) on
Ubuntu Server 12.04. Thank you!
--
Dotan Cohen
http://gibberish.co.il
terms in the main query returns results in
milliseconds.
Note that I am not using any wildcard queries; in each case I am
specifying the field to search and the terms to search on. Where
should I start to debug?
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On Thu, Jun 27, 2013 at 12:14 PM, Upayavira u...@odoko.co.uk wrote:
can you give an example?
Thank you. This is an example query:
select
?q=search_field:iraq
&fq={!cache=false}search_field:love%20obama
&defType=edismax
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
, but not to field size in words.
Thank you.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
that in fact begin with 'dotan-', even if a document has other tags
such as 'beatles'?
4) How to have Solr return only those faceting values which are larger than 0?
Thank you!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
facet.mincount - looks like you want to set it to 1,
instead of the default which is 0.
Perfect, thank you Raymond!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
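The facet.mincount advice can be sketched as a small URL builder. This is a minimal sketch in Python, not anyone's actual code from the thread: the host and core name in SOLR_SELECT are placeholders, build_facet_url is a hypothetical helper, and the field name tags is taken from the question. The per-field form uses Solr's f.<field>.facet.mincount override.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Solr core; adjust host and core name.
SOLR_SELECT = "http://127.0.0.1:8983/solr/core/select"


def build_facet_url(field, filter_query, mincount=1, per_field=False):
    """Return a select URL that facets on `field` and hides low counts."""
    # Per-field override syntax: f.<field>.facet.mincount
    key = f"f.{field}.facet.mincount" if per_field else "facet.mincount"
    params = [
        ("q", "*:*"),
        ("fq", filter_query),
        ("facet", "true"),
        ("facet.field", field),
        ("rows", "0"),  # only facet counts are wanted, no documents
        (key, str(mincount)),
    ]
    # urlencode percent-escapes characters such as ':' and '*'
    return SOLR_SELECT + "?" + urlencode(params)


print(build_facet_url("tags", "tags:dotan-*"))
print(build_facet_url("tags", "tags:dotan-*", per_field=True))
```

With mincount=1, facet values that match no documents in the result set are left out of the response.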
On Wed, Jun 5, 2013 at 3:41 PM, Brendan Grainger
brendan.grain...@gmail.com wrote:
Hi Dotan,
I think all you need to do is add:
facet.mincount=1
i.e.
select?q=*:*&fq=tags:dotan-*&facet=true&facet.field=tags
&rows=0&facet.mincount=1
Note that you can do it per field as well:
with
_both_ term1 _and_ term2, which could be between 0-10 documents.
Note that in the application, users will be searching for any
arbitrary number of terms, in fact they will be entering phrases. I
can limit these phrases to 140 characters if needed.
Thank you in advance!
--
Dotan Cohen
http
On Wed, Jun 5, 2013 at 6:10 PM, Shawn Heisey s...@elyograg.org wrote:
On 6/5/2013 9:03 AM, Dotan Cohen wrote:
How would one write a query which should perform set union on the
search terms (term1 OR term2 OR term3), and yet also perform phrase
matching if both terms are found? I tried a few
my need, though.
Thanks!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
that the ExtendedDisMax page does in fact mention that fuzziness
is supported:
http://wiki.apache.org/solr/ExtendedDisMax#Query_Syntax
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On Wed, Jun 5, 2013 at 9:04 PM, Eustache Felenc
eustache.fel...@idilia.com wrote:
There is also http://wiki.apache.org/solr/SolrRelevancyCookbook with nice
examples.
Thank you.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
of, such as the case in this thread, where I was parsing entire
documents to change the multiField value.
Thank you very much!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
manual that is particularly urgent that I should read, please do
mention it. Thanks!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
such a doc...:
<doc>
<arr name="tags">
<str>a</str>
<str>b</str>
</arr>
</doc>
...with this doc...:
<doc>
<arr name="tags">
<str>a</str>
</arr>
</doc>
...has no effect? Can multiValue fields be only added, but not removed?
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
that will
hopefully help.
Thank you Shawn!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
this flexibility, unless I maintained two separate clouds.
Thank you. I am not using Solr Cloud but if I ever consider it, then I
will keep this in mind.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
translating that into the JSON format that
I work with.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
are
making.
Upayavira
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
seems to be the best answer, to date - but no guarantees!
I don't have an answer for that, sorry!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
is happening, clients will be performing
searches and a few hundred documents will be written per minute. Note
that the machine running Solr is an EC2 instance running on Amazon Web
Services, and that the 'disk' on which the Solr index is stored is an
EBS volume.
Thank you.
--
Dotan Cohen
http
before
making another request.
Actually, I would add a filter query for documents whose last_index
value is before the last schema change, and stop when fewer documents
were returned than were requested.
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
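The stop condition described above (keep requesting batches of stale documents and stop once a batch comes back smaller than requested) can be sketched as follows. This is a minimal sketch in Python under stated assumptions: fetch_stale and reindex are hypothetical callables standing in for the Solr query (filtered on last_index being older than the schema change) and for the re-indexing step; neither is a Solr API.

```python
def reindex_stale(fetch_stale, reindex, page_size=1000):
    """Re-index stale documents batch by batch; return how many were handled.

    fetch_stale(rows) -> list of up to `rows` documents that still match the
        "last_index is before the schema change" filter (hypothetical).
    reindex(docs)     -> re-indexes the batch so that those documents no
        longer match the filter (hypothetical).
    """
    total = 0
    while True:
        docs = fetch_stale(page_size)
        if docs:
            reindex(docs)
        total += len(docs)
        # A short (or empty) batch means nothing stale is left.
        if len(docs) < page_size:
            return total


# In-memory stand-in for the index, for illustration only.
stale = list(range(2500))


def fetch_stale(rows):
    return stale[:rows]


def reindex(docs):
    del stale[:len(docs)]


print(reindex_stale(fetch_stale, reindex))  # 2500
```

Because re-indexed documents drop out of the filter, every request can start at offset 0 instead of tracking a moving start position.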
with
removing and adding fields to the schema has shown almost no change in
the extant index results returned.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
word. By
definition, they are a new value for the term text.
I see. For some reason I did not concentrate on this key quote of yours:
...to remove the tokens that did not produce a stem ...
Now it makes perfect sense.
Thank you, Jack!
--
Dotan Cohen
http://gibberish.co.il
http://what
the
RemoveDuplicatesTokenFilterFactory? That seems odd.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
log as SEVERE. I thought that this
would be easy to Google for, but it is not! If there is a concise
document that examines this issue, I would love to know where on the
wild wild web it exists.
Thank you.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On Wed, Apr 3, 2013 at 8:47 PM, Shawn Heisey s...@elyograg.org wrote:
On 4/2/2013 3:09 AM, Dotan Cohen wrote:
I notice that this only occurs on queries that run facets. I start
Solr with the following command:
sudo nohup java -XX:NewRatio=1 -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX
up the entire heap.
Silently dropping data is by far the worse choice, I agree, especially
as a default setting.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
practice: a single default should be decided upon and Solr
should use this value when nothing is specified in solrconfig.xml, and
that _same_value_ should be specified in the stock solrconfig.xml. Is
it not a reasonable assumption that this would be the case?
--
Dotan Cohen
http://gibberish.co.il
http
as it contains
so many examples?
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
once per minute?
Even better, it sounds like a job for CommitWithin :
http://wiki.apache.org/solr/CommitWithin
I'll look into that. Thank you!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
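The difference between committing on every batch and using CommitWithin can be sketched as a choice of request parameter. This is a minimal sketch in Python, assuming the update endpoint quoted elsewhere in this archive; update_url is a hypothetical helper, and the 60000 ms window is only an example value chosen to mirror the once-per-minute idea.

```python
from urllib.parse import urlencode

# Assumption: the JSON update endpoint quoted in the thread.
SOLR_UPDATE = "http://127.0.0.1:8983/solr/core/update/json"


def update_url(commit_within_ms=None):
    """Build an update URL.

    With no argument, every request forces a commit (commit=true).
    With commit_within_ms, Solr is asked to commit the documents within
    that many milliseconds, so many batches can share one commit.
    """
    if commit_within_ms is None:
        params = {"commit": "true"}
    else:
        params = {"commitWithin": str(commit_within_ms)}
    return SOLR_UPDATE + "?" + urlencode(params)


print(update_url())        # http://127.0.0.1:8983/solr/core/update/json?commit=true
print(update_url(60000))   # http://127.0.0.1:8983/solr/core/update/json?commitWithin=60000
```

Letting Solr schedule the commit avoids opening a new searcher for every batch, which is what tends to trigger the maxWarmingSearchers warning.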
because I actually need the top keywords related to a
specific keyword. For instance, I need to know which words are most
commonly used with the word coffee.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
and the
following Java:
$ java -version
java version "1.6.0_27"
OpenJDK Runtime Environment (IcedTea6 1.12.3) (6b27-1.12.3-0ubuntu1~12.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
will start with -Xmx8g and test.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
.
Thank you.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On Tue, Apr 2, 2013 at 5:33 PM, Toke Eskildsen t...@statsbiblioteket.dk wrote:
On Tue, 2013-04-02 at 15:55 +0200, Dotan Cohen wrote:
[Toke: maxWarmingSearchers limit exceeded?]
Thank you Toke, this is exactly on my list of things to learn about
Solr. We do get the error mentioned and we
times a minute, and a
commit is done after each batch since I'm calling Solr as such:
http://127.0.0.1:8983/solr/core/update/json?commit=true
Should I remove commit=true and run a cron job to commit once per minute?
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
to determine what the top keywords / topics are. That query
would take up to 200 seconds to run, but it does not have to return
the results in real-time (the output goes to another process, not to a
waiting user).
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
much longer to run (~60-80 ms) than the queries without (~1-4 ms). I
figured that the problem may have been with the caching.
In fact, a query with a filter query and caching disabled runs in
the range of 16-30 ms, which is quite an improvement.
Thanks.
--
Dotan Cohen
http
Server. Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
. Actually, running a clean 4.1 with no previous index
does not have the issues.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
are you suggesting for this issue? Can
Solr 4.2 natively import from a Solr index?
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
changed. Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
this.
Note that we are talking about ~18,000,000 (yes, 18 million) small
documents similar to 'tweets' (mostly under 1 KiB each, very very few
over 5 KiB).
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
them into
Solr 4.1, then requests another N documents to index? Or is there
an internal Solr / Lucene facility for this? I've actually looked for
such a facility, but as I am unable to find such a thing I ask.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On Fri, Mar 1, 2013 at 12:22 PM, Rafał Kuć r@solr.pl wrote:
Hello!
As far as I know, you have to re-index using an external tool.
Thank you Rafał. That is what I figured.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
. FWIW you
can find the same question and my response on Stack Overflow.
~ David
Thank you David. In fact I do frequent Stack Overflow.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
inside.
Hi Alex. Would you mind posting the new analyzers?
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
checking which classes are available?
Thank you.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
for Phoronix, and much more
relevant for some readers than Jack-the-Ripper or Quake.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
, not on
the Websolr account. But I commend you for taking notice and taking an
interest. Thank you!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
that produces the above form.
Thank you Shawn, that is much cleaner and will be easier to debug when
/ if things go wrong.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On Wed, Nov 7, 2012 at 5:16 PM, Walter Underwood wun...@wunderwood.org wrote:
You are probably thinking of SweetSpotSimilarity. You might also want to look
at pivoted document normalization.
Thanks, I'll take a look at that.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
smoother. Try doing that.
Otis
Thanks, Otis. I'll start googling for Solr and Lucene Similarity.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
, but new
documents are being added that ostensibly should have that value.
I'll try adding a document with post.jar and see what happens. I'll
update the thread.
Thanks!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
again?
Thanks. Seeing how all the documents are being added, either there is
a valid format in the created_iso8601 field or it is empty. I've
pretty much ruled out empty in code, but still nothing in the index.
I'll play around some more and update the list. At least I am learning.
--
Dotan
wrong, I'll get to looking at that
now. I'm surprised that Solr accepts the documents with bad data in
some of the fields; I will look into that as well.
Have a peaceful Saturday.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
in that
field comes first or last (I don't know which).
Thank you. In fact, I am being careful to try to pull up records after
the date on which the application was updated to populate the field.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
documents that do not have a value for a field, that's the purpose of
sortMissingFirst/Last.
All the newest documents have the field, and I'm sorting by time
descending. In fact, I did test with more rows, but for the mailing
list I wanted the output to be concise.
Thanks.
--
Dotan Cohen
http
full-text fields or fields that
need an index-time boost need norms.
http://wiki.apache.org/solr/SchemaXml
Thank you, but I am looking for a query-time modifier. I do need the
fieldNorm enabled in the general sense.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
not apply on-the-fly score computation component
coefficients. Surely I'm not the first dev to run into an issue with
the default scoring algorithm and want to tweak it only on specific
queries!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
that route I would have it recognize some
LocalParams (such as omitNorms=true right there) to be flexible at
query time. I'm actually surprised that this doesn't yet exist.
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
defined on handlers that may list the fields to
return as Alexandre had mentioned.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
are exactly the documents that I need. Google
should hire you to fill in the pages when someone searches for java
garbage collection. Interestingly, I just checked, and bing.com does
list the Oracle page on the first page of results. I shudder to think
that I might have to switch search engines!
--
Dotan
.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
the
effect.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
0 0
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
-filterCache
CACHE-queryResultCache
CORE-searcher
Thanks,
Shawn
Thank you Shawn. The information is here:
http://pastebin.com/aqEfeYVA
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
immediately available for search.
Thanks, Erick. I'll play around with different configurations. So far
just removing the periodic optimize command worked wonders. I'll see
how much it helps or hurts to run that daily, or more or less frequently.
--
Dotan Cohen
http://gibberish.co.il
http
.22_mean_in_my_logs.3F
I happen to know that the script will try to commit once every 60
seconds. How does one reduce the work in newSearcher listeners? What
effect will this have? What effect will reducing the autowarmCount on
caches have?
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http
and I cannot
find the cache statistics. Where are they?
Lowering the autowarmCount should lower the time needed to warm up;
however, you can also look at your warming queries (if you have any)
and see how long they take.
Thank you, I will look at that!
--
Dotan Cohen
http://gibberish.co.il
On Mon, Oct 22, 2012 at 5:27 PM, Mark Miller markrmil...@gmail.com wrote:
Are you using Solr 3X? The occasional long commit should no longer
show up in Solr 4.
Thank you Mark. In fact, this is the production release of Solr 4.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On Mon, Oct 22, 2012 at 7:29 PM, Shawn Heisey s...@elyograg.org wrote:
On 10/22/2012 9:58 AM, Dotan Cohen wrote:
Thank you, I have gone over the Solr admin panel twice and I cannot find
the cache statistics. Where are they?
If you are running Solr4, you can see individual cache autowarming
the best use of that for Solr assuming both heavy reads
and writes?
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
cached
Mem:14 2 12 0 0 1
-/+ buffers/cache: 0 14
Swap:0 0 0
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
Server, it was
called force merge and we had to tell people to stop doing that nearly
every month.
Thank you for those links. I commented on the Solr bug. There are some
very insightful comments in there.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
to experiment with the warning. Thank you for the tips.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
of
those errors without seeing the exception.
I see, thanks. I don't think that I'm using the SolrCloud feature. Is
it enabled because there exist solr/collection1 and also
multicore/core0?
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
and forget about it.
You don't need to do anything with it as it is used internally by
Solr.
That is exactly my plan, but I would also like to understand more
about what is going on. I don't like cut-and-paste programming.
Thank you very much!
--
Dotan Cohen
http://gibberish.co.il
http
to the example schema.xml file provided with
Solr, then I'd love to. I'm signing up for the dev list now. Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
a Solr 4 Beta index running on Websolr that does not have
such a field. It works, but throws many Service Unavailable and
Communication Error errors. Might the lack of the _version_ field be
the reason?
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
Where did the additional relative paths 'collection1',
'collection1/data', and 'collection1/data/index' come from? I know
that I can change the value of CWD with the -Dsolr.solr.home flag, but
what affects the relative paths mentioned?
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what
which is how Solr knows about the
relative paths. There should also be a README.txt file that will tell you
more about how the directory is expected to be organized.
Cheers,
Tricia
Thanks. I read the top-level README.txt but now I see that the answer
is in the solr/README.txt file.
--
Dotan
in
the production application and in production schema do in fact match!
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
that the
highlighted sections come after the main results. The highlighting
feature works as expected.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
<em>glitters is gold</em>
Whereas I need:
<str>all that glitters is gold</str>
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
fields may reduce the data transferred by an order
of magnitude.
Thanks.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
with Solr-PHP-Client. In fact,
preceding the variable with (int) does resolve the issue I
have found. This looks like an issue with PHP being weakly typed.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com