Hoss, I'm so happy you realized the problem because I was quite worried
about it!!
Let me know if I can provide support with testing it.
The last two days I was busy migrating a bunch of hosts, which
should -hopefully- be finished today.
Then I will again have the infrastructure for running
One more thanks for posting this!
I struggled with the same issue yesterday and solved it with the _version_
hint from the mailing list.
Alex.
-Original Message-
From: Mark Mandel [mailto:mark.man...@gmail.com]
Sent: Thursday, September 06, 2012 1:53 AM
To: solr-user@lucene.apache.org
Hi,
Thanks,
I am getting results with the URL below:
*suggest/?q=michael b&df=title&defType=lucene&fl=title*
But I want the results in the spellcheck section,
and I want to search on title or empname or both.
Aniljayanti
Hi
I am trying to implement some auto suggest functionality, and am currently
looking at the terms component (Solr 3.6).
For example, I can form a query like this:
http://solrhost/solr/mycore/terms?terms.fl=title_s&terms.sort=index&terms.limit=5&terms.prefix=Hotel+C
which searches in the title_s
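When this kind of URL is pasted around, the `&` separators between parameters are easy to lose. As a sketch (host and core name taken from the message above), building the same terms-component request with `urllib` keeps the separators intact:

```python
# Reconstructing the terms-component URL from the message above with
# urllib, so each parameter is separated and encoded correctly.
from urllib.parse import urlencode

params = {
    "terms.fl": "title_s",       # field to pull terms from
    "terms.sort": "index",       # index order, required for terms.prefix
    "terms.limit": 5,
    "terms.prefix": "Hotel C",   # urlencode turns the space into '+'
}
url = "http://solrhost/solr/mycore/terms?" + urlencode(params)
print(url)
```

The same approach works for any Solr handler URL; it avoids hand-concatenating query strings entirely.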
If your interest is focusing on the real textual content of a web page, you
could try this: JReadability (https://github.com/ifesdjeen/jReadability,
Apache 2.0 license), which wraps JSoup (as Lance suggested) and applies a
set of predefined rules to scrap crap (nav, headers, footers, ...) off of
Hi Peter,
Yes if you want to do complex things in suggest mode, you'd better rely on
the SearchComponent...
For example, this blog post is a good read
http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ ,
if you have complex requirements on the searched fields.
(Although
Commits are not too frequent; each batch is 100 records and takes 40 to 60 secs
before another commit.
No, I am not indexing with multiple threads; it uses a single-thread executor.
I have seen steady performance for now after increasing the merge factor
from 10 to 25.
Will have to wait and watch if that
Hi
I have installed solr 3.6.1 on tomcat 7.0 following the steps here.
http://ralf.schaeftlein.de/2012/02/10/installing-solr-3-5-under-tomcat-7/
The solr home page loads fine, but the admin page
(http://localhost:8080/solr/admin/) throws the error "missing core name in path".
I am installing single
Hello,
I'm currently developing a custom component in Solr.
This component works fine. The problem I have is that I only have access to the
searcher, which gives me the option to fire e.g. BooleanQueries.
This searcher gives me a result, which I have to iterate over to calculate
information which
Hi,
just found a solution, but you have to know what you want to count:
try {
    final SolrIndexSearcher s = rb.req.getSearcher();
    final SolrQueryParser qp = new SolrQueryParser(rb.req.getSchema(), null);
    final String queryString = "entity_type:RELEASE";
    final Query q = qp.parse(queryString);
Hello,
I was under the impression that edismax was supposed to be crash proof
and just ignore bad syntax. But I am either misconfiguring it or hit a
weird bug. I basically searched for text containing '/' and got this:
{
  'responseHeader'=>{
    'status'=>400,
    'QTime'=>9,
    'params'=>{
As far as I understand, / is a special character and needs to be escaped.
Maybe foo\/bar should work?
I found this when I looked at the code of ClientUtils.escapeQueryChars:
// These characters are part of the query syntax and must be escaped
if (c == '\\' || c == '+' || c == '-' || c ==
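The escaping quoted above can be sketched client-side; this mirrors the character list in SolrJ's ClientUtils.escapeQueryChars, though the exact set varies by Solr version (the slash, notably, only became query syntax with regex support):

```python
# Sketch of query-character escaping in the spirit of SolrJ's
# ClientUtils.escapeQueryChars; the character set below is an assumption
# based on the 4.x code and may differ in other versions.
SPECIAL = set('\\+-!():^[]"{}~*?|&;/')

def escape_query_chars(s: str) -> str:
    out = []
    for c in s:
        # prefix every syntax character (and whitespace) with a backslash
        if c in SPECIAL or c.isspace():
            out.append('\\')
        out.append(c)
    return ''.join(out)

print(escape_query_chars("foo/bar"))
```

With this, the `foo/bar` case discussed in the thread becomes `foo\/bar` before it ever reaches the query parser.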
I believe this is caused by the regex support in
https://issues.apache.org/jira/browse/LUCENE-2039
It certainly seems wrong to interpret a slash in the middle of the
word as the start of a regex, so I've reopened the issue.
-Yonik
http://lucidworks.com
On Thu, Sep 6, 2012 at 9:34 AM, Alexandre
Thanks Rafał and Markus for your comments.
I think Droids has a serious problem with URL parameters in the current version
(0.2.0) from Maven Central:
https://issues.apache.org/jira/browse/DROIDS-144
I knew about Nutch, but I haven't been able to implement a crawler with it.
Have you done that or
You have deletedPKQuery, but the correct spelling is deletedPkQuery
(lowercase k). Try that and see if it fixes your problem.
Also, you can probably simplify this if you do this as
command=full-import&clean=false, then use something like this for your query:
select product_id as
That's what I was thinking, but when I tried foo/bar in Solr 3.6 and
4.0-BETA it was working fine - it split the term and generated the proper
query without any error.
I think the problem is if you use the default Lucene query parser, not
edismax. I removed defType=edismax from my query
I am on 4.0 alpha. Maybe it was fixed in beta. But I am most
definitely seeing this in edismax. If I get rid of / and use
debugQuery, I get:
'responseHeader'=>{
  'status'=>0,
  'QTime'=>14,
  'params'=>{
    'debugQuery'=>'true',
    'indent'=>'true',
    'q'=>'foobar',
    'qf'=>'TitleEN
Hello!
I think that really depends on what you want to achieve and what parts
of your current system you would like to reuse. If it is only HTML
processing I would let Nutch and Solr do that. Of course you can
extend Nutch (it has a plugin API) and implement the custom logic you
need as a Nutch
I do in fact see your problem with an earlier 4.0 build, but not with
4.0-BETA.
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Thursday, September 06, 2012 10:13 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.0alpha: edismax complaints on certain
-Original message-
From:Lochschmied, Alexander alexander.lochschm...@vishay.com
Sent: Thu 06-Sep-2012 16:04
To: solr-user@lucene.apache.org
Subject: AW: Website (crawler for) indexing
Thanks Rafał and Markus for your comments.
I think Droids has a serious problem with URL
The fix in edismax was made just a few days (6/28) before the formal
announcement of 4.0-ALPHA (7/3), but unfortunately the fix came a few days
after the cutoff for 4.0-ALPHA (6/25).
See:
https://issues.apache.org/jira/browse/SOLR-3467
(That issue should probably be annotated to indicate that
: gpg: Signature made 08/06/12 19:52:21 Pacific Daylight Time using RSA key
: ID 322D7ECA
: gpg: Good signature from Robert Muir (Code Signing Key) rm...@apache.org
: *gpg: WARNING: This key is not certified with a trusted signature!*
: gpg: There is no indication that the signature
: Some extra information. If I use curl and force it to use HTTP 1.0, it is more
: visible that Solr doesn't allow persistent connections:
a) solr has nothing to do with it; it's entirely something under the
control of jetty & the client.
b) I think you are introducing confusion by trying to
Thank you. I did the test with curl the same way you did it and it works.
I still can not get ab (apache benchmark) to reuse connections to
solr. I'll investigate this further.
$ ab -c 1 -n 100 -k 'http://localhost:8983/solr/select?q=*:*' | grep Alive
Keep-Alive requests:0
-- Aleksey
On
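For anyone who wants to verify keep-alive behavior without ab, here is a self-contained Python sketch using a local stand-in server (not Solr/Jetty itself; the path is just decorative): if the connection is reused, the server sees every request arrive from the same client port.

```python
# Demonstrating HTTP keep-alive: several requests on one HTTPConnection
# should reach the server over a single TCP connection.
import http.client
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

PORTS = set()  # client ports seen by the server; one port == one connection

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 defaults to persistent connections

    def do_GET(self):
        PORTS.add(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for reuse
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
for _ in range(3):
    conn.request("GET", "/solr/select?q=*:*")
    assert conn.getresponse().read() == b"ok"  # drain before the next request
conn.close()
server.shutdown()
print("connections used:", len(PORTS))
```

If the server (or client) dropped the connection after each request, PORTS would contain three entries instead of one.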
Hey Guys,
I created a program to export Solr index data to XML.
The URL is https://github.com/eltu/Solr-Export
Please tell me about any problems.
Note: I have only tested it with Solr 3.6.1.
Thanks,
Helton
I have made a schema change to copy an existing field name (source field)
to an existing search field, text (destination field).
Since I made the schema change, I updated all the documents, thinking the new
source field would be combined with the text field. The search for
a specific
We have a distributed Solr setup with 8 servers and 8 cores on each server in
production. We see this error multiple times on our Solr servers. We are
using Solr 3.6.1. Has anyone seen this error before, and have you resolved
it?
2012-09-04 02:16:40,995 [http-nio-8080-exec-7] ERROR
Hi Jack,
24-bit = 16M possibilities, that's clear; just to confirm... the rest is
unclear: why can 4-byte have 4 million cardinality? I thought it was 4
billion...
And, just to confirm: UnInvertedField allows 16M cardinality, correct?
On 12-08-20 6:51 PM, Jack Krupansky
Hi Lance,
Use case is keyword extraction, and it could be 2- and 3-grams (2- and
3-word); so theoretically we can have 10,000^3 = 1,000,000,000,000
3-grams for English alone... of course my suggestion is to use statistics and
to build a dictionary of such 3-word combinations (remove top,
It's actually limited to 24 bits to point into the term list in a
byte[], but there are 256 different arrays, so the maximum capacity is
4B bytes of un-inverted terms; each bucket is limited to 4B/256, so
the real limit can come in a little lower depending on luck.
From the comments:
* There is
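Working through the arithmetic from the two messages above (just the numbers, not the actual UnInvertedField code):

```python
# 24 bits of offset per bucket, 256 buckets: the capacity figures quoted
# in the thread fall straight out of the bit widths.
offset_bits = 24
buckets = 256

per_bucket_capacity = 2 ** offset_bits          # 16,777,216 addressable bytes
total_capacity = buckets * per_bucket_capacity  # 4,294,967,296 bytes (~4 GB)

print(per_bucket_capacity, total_capacity)
```

So "24-bit = 16M" and "4B bytes across 256 arrays" are the same 32 bits sliced two ways, which is why any one bucket tops out well before the 4B total.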
Hi,
I am using Solr with DIH and started getting errors when the database
time/date fields are imported into Solr. I used date as
the field type, but when I looked at the docs it appears the date
field does not accept (Thu, 06 Sep 2012 22:32:33 +) or (1346976590)
: I am using Solr with DIH and started getting errors when the database
: time/date fields are getting imported in to Solr. I have used the date as
what actual error are you getting?
If you are pulling dates from a SQL Date field, that the jdbc driver
returns as java.util.Date objects, then
http://www.electrictoolbox.com/article/mysql/format-date-time-mysql/ hth --
H
On 6 Sep 2012 17:23, kiran chitturi chitturikira...@gmail.com wrote:
Hi,
I am using Solr with DIH and started getting errors when the database
time/date fields are getting imported in to Solr. I have used the date
: I am facing a strange problem. I am searching for the word jacke, but solr also
: returns results where my description contains 'RCA-Jack/'. If I search
: jacka or jackc or jackd, it works fine and does not return me any
: result, which is what I am expecting in this case.
you need to tell us what
I don't know for sure, but I remember something around this being a problem,
yes ... maybe https://issues.apache.org/jira/browse/LUCENE-3907 ?
Otis
Performance Monitoring for Solr / ElasticSearch / HBase -
http://sematext.com/spm
- Original Message -
From: Walter Underwood
Hi,
Thank you for your response.
The error I am getting is 'org.apache.solr.common.SolrException: Invalid
Date String: '1345743552''.
I think it was being saved as a string in the DB, so I will use the
DateFormatTransformer.
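Whichever transformer ends up doing the conversion, the target is Solr's ISO-8601 UTC date form. A quick Python sketch (not part of Solr or DIH) of the conversion for the epoch value from the error:

```python
# Converting epoch seconds into the "yyyy-MM-dd'T'HH:mm:ss'Z'" form that
# Solr date fields expect.
from datetime import datetime, timezone

def epoch_to_solr_date(epoch: int) -> str:
    """Render epoch seconds as an ISO-8601 UTC timestamp for a Solr date field."""
    dt = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

print(epoch_to_solr_date(1345743552))
```

The same conversion can of course be done in SQL before the value ever reaches DIH, which avoids the transformer entirely.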
When I index a text field which has Arabic and English, like this tweet
Hey guys!
I've been attempting to get SolrCloud set up on an Ubuntu VM, but I believe
I'm stuck.
I've got Tomcat set up, the Solr war file in place, and when I browse to
localhost:port/solr, I can see Solr. CHECK
I've set the zoo.cfg to use port 5200. I can start it up and see it's
running (ls
Yes, that is exactly the bug. EdgeNgram should work like the synonym filter.
wunder
On Sep 6, 2012, at 5:51 PM, Otis Gospodnetic wrote:
I don't know for sure, but I remember something around this being a problem,
yes ... maybe https://issues.apache.org/jira/browse/LUCENE-3907 ?
Otis
Greetings,
I'm looking to add some additional logging to a solr 3.6.0 setup to
allow us to determine actual time spent by Solr responding to a
request.
We have a custom QueryComponent that sometimes returns 1+ MB of data
and while QTime is always on the order of ~100ms, the response time at
the
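The gap being measured here is between Solr's internal QTime and the wall-clock time the client observes, which also covers response writing and transfer. A tiny Python sketch of that distinction (the sleeps are stand-ins, not real Solr calls):

```python
# QTime-style internal timing vs. client-observed wall-clock timing.
import time

def handler():
    """Stand-in for a request handler: returns its internal 'QTime'."""
    start = time.perf_counter()
    time.sleep(0.01)   # query processing (the part QTime counts)
    qtime = time.perf_counter() - start
    time.sleep(0.02)   # response serialization/transfer (not in QTime)
    return qtime

wall_start = time.perf_counter()
qtime = handler()
wall_time = time.perf_counter() - wall_start
print(f"qtime={qtime:.3f}s wall={wall_time:.3f}s")
```

For a 1+ MB response, the difference between the two numbers is exactly the serialization-and-transfer overhead the extra logging is meant to capture.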
On 7 September 2012 06:24, kiran chitturi chitturikira...@gmail.com wrote:
[...]
When I index a text field which has Arabic and English like this tweet
“@anaga3an: هو سعد الحريري بيعمل ايه غير تحديد الدوجلاس ويختار الكرافته ؟؟”
#gcc #ksa #lebanon #syria #kuwait #egypt #سوريا
with field_type
I'd still love to see a query lifecycle flowchart, but, in case it
helps any future users or in case this is still incorrect, here's how
I'm tackling this:
1) Override the default json responseWriter with my own in solrconfig.xml:
<queryResponseWriter name="json"
Also, your browser may use a platform default for the encoding instead of
UTF-8. Some MacOS and Windows browsers have this problem.
Tomcat sometimes needs adjustment to use UTF-8. If you are on tomcat, check
this:
http://find.searchhub.org/link?url=http://wiki.apache.org/solr/SolrTomcat
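To see why the container encoding matters, percent-encode the same non-ASCII term under two encodings; the bytes on the wire differ, so a server decoding with the wrong default mangles the query (the term here is a made-up example):

```python
# The same query term produces different percent-encoded bytes under
# UTF-8 and Latin-1, which is exactly the mismatch a mis-configured
# Tomcat connector causes.
from urllib.parse import quote

term = "Müller"  # hypothetical query term; any non-ASCII value shows the effect
utf8_form = quote(term, encoding="utf-8")
latin1_form = quote(term, encoding="latin-1")
print(utf8_form, latin1_form)
```

If the browser sends the Latin-1 form and the container assumes UTF-8 (or vice versa), the decoded term no longer matches what was indexed.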
Grouping isn't defined for tokenized fields I don't think. See:
http://wiki.apache.org/solr/FieldCollapsing where it says for
group.field:
..The field must currently be single-valued...
Are you sure you don't want faceting?
Best
Erick
On Tue, Sep 4, 2012 at 5:27 AM, mechravi25
Try using edismax to distribute the search across the fields rather
than using the catch-all field. There's no way that I know of to
reconstruct what field the source was.
But storing the source fields without indexing them is OK too, it won't affect
searching speed noticeably...
Best
Erick
On
I don't know of any better way to do this. Conflating the fields is
not _that_ error prone, although it is annoying I agree. I think that
idea is better than storing them separately.
Best
Erick
On Tue, Sep 4, 2012 at 4:58 PM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
Hello,
I have some
And you've illustrated my viewpoint, I think, by saying
"two obvious choices".
I may prefer the first, and you may prefer the second. Neither is
necessarily more correct IMO; it depends on the problem
space. Choosing either one will be unpopular with anyone
who likes the other.
And I suspect that
Securing Solr pretty much universally requires that you only allow trusted
clients to access the machines directly, usually secured with a firewall
and allowed IP addresses, the admin handler is the least of your worries.
Consider if you let me ping solr directly, I can do something really
Guenter:
Are you using SolrCloud or straight Solr? And were you updating in
batches (i.e. updating multiple docs at once from SolrJ by using the
server.add(doclist) form)?
There was a bug in this process that caused various docs to show up
in various shards differently. This has been fixed in