Well some of the searches take minutes.
Below are some stats about this particular index that I am talking about:
Index size = 400GB (Using CommonGrams so without that the index is around
180GB)
Position File = 280GB
Total Docs = 170 million (just indexed for searching - for highlighting
On 3/12/2014 6:27 PM, Erick Erickson wrote:
Wondering if 4.7 is a natural point to do this.
See Uwe's announcement that as of Solr 4.8,
Solr/Lucene will _require_ Java 1.7 rather than
Java 1.6.
I know some organizations will not be able to
make this transition easily, thus I suspect
I am using Solr 4.6.0 in cloud mode. The setup is of 4 shards, 1 on each
machine with a zookeeper quorum running on 3 other machines. The index size
on each shard is about 15GB. I noticed that the number of segments in
second shard was 42 and in the remaining shards was between 25-30.
I am
Hi Erick,
I've used the fl=id parameter to avoid retrieving the actual documents
(step 4 in your mail) but the problem still exists.
Any ideas on how to find the merging time (step 3)?
Remi
On Tue, Mar 11, 2014 at 7:29 PM, Erick Erickson erickerick...@gmail.com wrote:
In SolrCloud there are a
Hi Varun,
I would just like to say that I have the same two problems you've mentioned
and I couldn't figure out a way to solve them.
For the second, I posted a question a couple of days ago, titled "Result
merging takes too long".
Remi
On Thu, Mar 13, 2014 at 3:44 PM, Varun Rajput
Hello Vijay,
You can try FieldCollapsing, Join, Block-join, or just concatenate both
fields and search on the concatenation.
On Thu, Mar 13, 2014 at 7:16 AM, Vijay Kokatnur kokatnur.vi...@gmail.com wrote:
Hi,
I've inherited a Solr application with a schema that contains parent-child
Hi guys,
The following code
server.queryAndStreamResponse(new SolrQuery("*:*"), new
StreamingResponseCallback() {
    public void streamSolrDocument(SolrDocument doc) {
    }
    public void streamDocListInfo(long numFound, long start, Float maxScore) {
    }
});
throws
Caused by:
Oh yes, I see what you mean. I would try SOLR-1632 and have distributed IDF,
but it seems to be broken now.
-Original message-
From: Steven Bower smb-apa...@alcyon.net
Sent: Wednesday 12th March 2014 21:47
To: solr-user solr-user@lucene.apache.org
Subject: Re: IDF maxDocs / numDocs
Hi,
I have a Solr index directory on one machine. I want a second Solr instance on
a different server to use this index. Is it possible to specify a path on
a remote machine as the data directory?
Thanks,
Prasi
Prasi,
It is not possible to use the index files of one Solr instance for a second
instance. The reason is that, while booting, a Solr instance locks
the schema and index files to make sure another instance won't update the
index and schema files.
As you mentioned like want
Hi, thank you. While it is good for visual review, it is hard to work with this
data. What I need is to build something like this:
| Name | Twitter Profile | Topics | Site Title | Site Description | Site content |
| John Doe | Yes | No | Yes | No
On Thu, Mar 13, 2014 at 10:41 AM, Marius Dumitru Florea
mariusdumitru.flo...@xwiki.com wrote:
Hi guys,
The following code
server.queryAndStreamResponse(new SolrQuery("*:*"), new
StreamingResponseCallback() {
public void streamSolrDocument(SolrDocument doc) {
}
public void
Hi,
I am trying to fetch all the records for 2005.
I have an int field, pubdateraw: 20130508
Not working - select?q=pubdateraw:/2013*/
Not working - select?q=pubdateraw:/.2013*./
Is it possible to use a regex on an int field in Solr 4.5?
To get the record with 20130508, how am I supposed to write my
Hi Priti,
That's an interesting question; I wonder about the answer myself too. Does a
prefix query work with int?
q=pubdateraw:2013* ?
In the meantime, as a workaround, try range queries. q=pubdateraw:[20130101 TO
20131231]
On Thursday, March 13, 2014 12:45 PM, Priti Solanki pritiatw...@gmail.com
Regular expressions are a text-matching mechanism, so you shouldn't expect
to be able to use them on numeric data. If your timestamps are of the form
you indicate, you should be able to filter on pubdateraw:[2005 TO
2005].
On Thu, Mar 13, 2014 at 11:45 AM, Priti Solanki
Both work!
pubdateraw:[2005 TO 2005]
pubdateraw:[20050101 TO 20051231]
Thanks Raymond for sharing the useful info as well.
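A minimal sketch of how the full-year range form above could be generated for an integer field that stores dates as yyyymmdd. The helper name year_range_query is hypothetical; only the field name pubdateraw and the inclusive [a TO b] syntax come from the thread.

```python
def year_range_query(field: str, year: int) -> str:
    """Build an inclusive Solr range query covering one calendar year
    on an integer field that stores dates as yyyymmdd."""
    if not 1000 <= year <= 9999:
        raise ValueError("year must have four digits")
    start = year * 10000 + 101    # yyyy0101, January 1st
    end = year * 10000 + 1231     # yyyy1231, December 31st
    # Square brackets make both bounds inclusive; curly braces would exclude them.
    return f"{field}:[{start} TO {end}]"


print(year_range_query("pubdateraw", 2005))
```

Square brackets are used on purpose: with curly braces the first and last day of the year would be left out of the result.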
On Thu, Mar 13, 2014 at 4:30 PM, Raymond Wiker rwi...@gmail.com wrote:
Regular expressions are a text-matching mechanism, so you shouldn't expect
to be able to
I have given up on this idea and made a wrapper which adds an fq with the user
roles to each request.
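A wrapper of this kind might look roughly like the sketch below. It is only an illustration: the function name with_role_filter and the field name roles are assumptions, not taken from the poster's setup, and the wrapper must run server-side so that clients cannot drop the filter.

```python
from urllib.parse import urlencode, parse_qs


def with_role_filter(params: dict, roles: list, field: str = "roles") -> str:
    """Return a query string with an fq appended that restricts results
    to documents visible to at least one of the user's roles.
    `field` is a hypothetical schema field holding the allowed roles."""
    if not roles:
        raise ValueError("refusing to query without any role")
    # Quote each role and escape backslashes and quotes inside it.
    quoted = " OR ".join(
        '"' + r.replace("\\", "\\\\").replace('"', '\\"') + '"' for r in roles
    )
    pairs = list(params.items())
    pairs.append(("fq", f"{field}:({quoted})"))
    return urlencode(pairs)


query_string = with_role_filter({"q": "*:*"}, ["admin", "sales"])
print(parse_qs(query_string)["fq"])
```

Appending the filter as a separate fq, rather than rewriting q, leaves the user's query untouched and lets Solr cache the role filter independently.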
-Original Message-
From: Andreas Owen [mailto:a...@conx.ch]
Sent: Tuesday, 11 March 2014 23:32
To: solr-user@lucene.apache.org
Subject: use local param in solrconfig fq for access-control
i
1. What is your Solr version? In the 4.x family, proximity searches have
been optimized, among other query types.
2. Do you use filter queries? What is the situation with the cache
utilization ratios? Optimize (i.e. bump up the respective cache sizes) if
you have low hit ratios and many
I have gotten nearly everything to work. There are two queries where I don't get
back what I want.
avaloq frage 1 - only returns results if I set minGramSize=1 while indexing
yh_cug - the query parser doesn't remove _ but the indexer does (WDF), so
there is no match
Is
When I index a PDF I would like to manually add the document's title in a
field named rmDocumentTitle.
I defined the field in the schema.xml, but when I query Solr I see that the
field was not created...
Am I doing something wrong?
Below are the code snippet, schema and solrconfig.xml.
Thank you
I tried to define a new field test in the schema (<field name="test"
type="string" indexed="true" stored="true" multiValued="true"/>) and added
req.setParam("literal.test", "test title"); in the code.
The field (test) is there O_O.
Can someone explain the difference to me? Why is rmDocumentTitle not there while
Ok, I renamed the field rmDocumentTitle to rmdocumenttitle and now the
field is there!
Are there naming rules for field names? No uppercase?
Greetings
Francesco
-Original Message-
From: Croci Francesco Luigi (ID SWS) [mailto:fcr...@id.ethz.ch]
Sent: Thursday, 13 March
On 13 March 2014 18:33, Croci Francesco Luigi (ID SWS)
fcr...@id.ethz.ch wrote:
Ok, I renamed the field rmDocumentTitle to rmdocumenttitle and now the
field is there!
Are there naming rules for field names? No uppercase?
No. We have used mixed-case names in the past.
Are you
Yes, in my test class I always do server.deleteByQuery("*:*", 5); first.
As you can see I have fullText and signatureField defined. And they are there.
The only difference is that they are not manually set.
Could it be that if you use the literal.* parameter you have to use lowercase?
Regards
Ok. Maybe I found the problem:
in the solrconfig.xml I have <str name="lowernames">true</str>
I set it to false and now rmDocumentTitle is there too...
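The effect of the lowernames setting can be approximated as below. This is a sketch based on the documented behaviour of the option (field names mapped to lowercase, with non-alphanumeric characters turned into underscores), not a copy of Solr's code; the function name lowernames is used here only for illustration.

```python
def lowernames(field_name: str) -> str:
    """Approximate the lowernames mapping: lowercase letters and digits,
    replace every other character with an underscore."""
    return "".join(ch.lower() if ch.isalnum() else "_" for ch in field_name)


print(lowernames("rmDocumentTitle"))
print(lowernames("Content-Type"))
```

With the option enabled, a value sent as literal.rmDocumentTitle ends up under the lowercased name, which is why only a schema field called rmdocumenttitle received it.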
Regards
Francesco
-Original Message-
From: Croci Francesco Luigi (ID SWS) [mailto:fcr...@id.ethz.ch]
Sent: Thursday, 13 March 2014 14:39
Hello Team,
I am trying to index meta data of html pages, my setup is Nutch 2.2.1 and
Solr 4.7.0
I can confirm Nutch is parsing meta tags and feeding data to the index on Solr.
But I am unable to see meta tags when I query data.
schema.xml configuration I've done,
To accept indexing meta tags I've
1- SOLR 4.6
2- We do but right now I am talking about plain keyword queries just sorted
by date. Once this is better will start looking into caches which we
already changed a little.
3- As I said the contents are not stored in this index. Some other metadata
fields are but with normal queries its
Hi;
I use Nutch and Solr to index meta tags. When you declare this:
<dynamicField name="meta_*" type="string" stored="true" indexed="true"/>
it should work. However, I have a question. You have this field for copy:
metatag.keywords
but your dynamic field is
meta_*
I mean it should have an underscore
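The mismatch can be checked with a small sketch of dynamic-field matching. It assumes the usual Solr restriction that a dynamic field pattern has a single wildcard at its start or end; the function name matches_dynamic_field is hypothetical.

```python
def matches_dynamic_field(field_name: str, pattern: str) -> bool:
    """Check a field name against a dynamic field pattern whose only
    wildcard is a leading or trailing '*'."""
    if pattern.endswith("*"):
        return field_name.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return field_name.endswith(pattern[1:])
    return field_name == pattern


print(matches_dynamic_field("meta_keywords", "meta_*"))
print(matches_dynamic_field("metatag.keywords", "meta_*"))
```

A field named metatag.keywords has no underscore after "meta", so it falls outside meta_* and needs either its own field definition or a pattern such as metatag.*.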
Hello,
How can I get Solr to report how long a query took, in milliseconds,
like
--7 results found in 0.00456 milliseconds.
Regards,
Kishan Parmar
Software Developer
+91 95 100 77394
Jay Shree Krishnaa !!
Hi Kishan,
Solr response already includes that info in QTime section. Aren't you seeing
it? If you don't see it try setting omitHeaders=false
On Thursday, March 13, 2014 6:12 PM, Kishan Parmar kishan@gmail.com wrote:
Hello,
how to get milliseconds result function in solr gives result
Hi,
Oops, I mistyped; it is omitHeader, not omitHeaders.
Please see : http://wiki.apache.org/solr/CommonQueryParameters#omitHeader
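As a rough illustration, the timing can be read from the responseHeader of a JSON response. The sample body below is made up for the example; QTime is reported by Solr in whole milliseconds and is absent when omitHeader=true.

```python
import json

# Hypothetical response body, shaped like Solr's wt=json output.
sample_response = """
{
  "responseHeader": {"status": 0, "QTime": 7},
  "response": {"numFound": 7, "start": 0, "docs": []}
}
"""


def summarize(raw_json: str) -> str:
    """Format the hit count and the server-side query time."""
    body = json.loads(raw_json)
    header = body.get("responseHeader")
    if header is None:
        return "no responseHeader (was omitHeader=true set?)"
    return f'{body["response"]["numFound"]} results found in {header["QTime"]} ms'


print(summarize(sample_response))
```

QTime covers only the time Solr spent processing the request, so network transfer and response parsing on the client come on top of it.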
Ahmet
On Thursday, March 13, 2014 6:37 PM, Ahmet Arslan iori...@yahoo.com wrote:
Hi Kishan,
Solr response already includes that info in QTime section. Aren't you
On 3/13/2014 1:44 AM, Varun Rajput wrote:
I am using Solr 4.6.0 in cloud mode. The setup is of 4 shards, 1 on each
machine with a zookeeper quorum running on 3 other machines. The index size
on each shard is about 15GB. I noticed that the number of segments in
second shard was 42 and in the
Hi Furkan,
Thanks, I've checked with only the dynamic field as well. Have you made any
other configuration changes to get it working?
Can you give me some examples of your meta tags, e.g. metatag.keywords?
Tx, Shanaka
On Thursday, 13 March 2014, Furkan KAMACI furkankam...@gmail.com wrote:
Hi;
Hi Remi,
I read your post and like you, I have also identified that running solr
4.6.0 in cloud mode results in higher response time which has something to
do with merging of documents from the various shards.
Looking at the source code, we couldn't understand why it would take so much
time for
Hi;
When I check my documents I see an example: meta_keywords. It should
work. You may have a problem on the Nutch side. Here is a link for it:
http://wiki.apache.org/nutch/IndexMetatags On the other hand, dynamic fields
in Solr are explained here:
In case anyone else runs across this issue, I think we've found a
work-around.
We're seeing the same behavior with Solr 4.6.0 and 4.7. DataImportHandler
loads documents, but the updates to the replica fail because of the limited
support for the BigDecimal type in SolrCloud.
We've successfully
Hi Furkan,
sure, this is my data-config.xml:
dataConfig
document
entity name=item pk=id dataSource=store_db onError=skip
query=SELECT IT.* FROM item AS IT JOIN order AS ORD ON
IT.order_id=ORD.id WHERE (IT.status=1 AND ORD.status=1)
deltaQuery=SELECT IT.* FROM item IT,
Hey Shawn,
The config with the old policy used to be the literal name
mergeFactor. With TieredMergePolicy, there are now three settings
that must be changed in order to actually be the same as what
mergeFactor used to do. The following config snippet is the equivalent
config to a mergeFactor
Here are some screenshots of our SolrCloud cluster via New Relic:
http://postimg.org/gallery/2hyzyeyc/
We currently have a 5-node cluster and all indexing is done on separate
machines and shipped over. Our machines are running on SSDs with 18G of
RAM (index size is 8G). We only have 1 shard at
On 3/13/2014 12:54 PM, cpk wrote:
We're seeing the same behavior with Solr 4.6.0 and 4.7. DataImportHandler
loads documents, but the updates to the replica fail because of the limited
support for the BigDecimal type in SolrCloud.
We've successfully worked around the issue by setting
A little more information: it seems the issue happens after we get an
OutOfMemory error on a facet query.
On Wed, Mar 12, 2014 at 11:06 PM, Avishai Ish-Shalom
avis...@fewbytes.com wrote:
Hi all!
After upgrading to Solr 4.6.1 we encountered a situation where a cluster
outage was traced to a
I think your response time includes the average response for add
operations, which generally return very quickly and, because of their sheer
number, pull down the average response time of your queries. New Relic should
break out requests based on which handler they're hitting, but it doesn't seem to.
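A small worked example of that dilution effect, with made-up numbers (the request counts and latencies below are illustrative, not taken from the cluster in question):

```python
def mean_response_ms(groups):
    """Overall mean latency from (request_count, average_ms) pairs."""
    total_requests = sum(count for count, _ in groups)
    total_time = sum(count * avg_ms for count, avg_ms in groups)
    return total_time / total_requests


adds = (9000, 5.0)       # hypothetical: many fast update requests
queries = (1000, 400.0)  # hypothetical: fewer, much slower searches

overall = mean_response_ms([adds, queries])
print(f"overall mean: {overall} ms, query mean: {queries[1]} ms")
```

Here the blended average comes out at 44.5 ms even though searches take 400 ms on average, which is why a per-handler breakdown is needed to see real query latency.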
Hi,
Ralph's comment makes sense. We can confirm his explanation. What happens when
you select only QueryComponent and FacetComponent in first graph (requests
response time)?
On Friday, March 14, 2014 12:18 AM, ralph tice ralph.t...@gmail.com wrote:
I think your response time is including the
Hi,
I think NR has support for breaking by handler, no? Just checked - no.
Only webapp controller, but that doesn't apply to Solr.
SPM should be more helpful when it comes to monitoring Solr - you can
filter by host, handler, collection/core, etc. -- you can see the demo -
Ahh.. it's including the add operation. That makes sense then. A bit silly
on NR's part that they don't break it down.
Otis, our index is only 8G so I don't consider that big by any means but
our queries can get a bit complex with a bit of faceting. Do you still
think it makes sense to shard? How
Hi,
I noticed the following post indicating that Solr could recover not-committed
data from operational log:
http://www.opensourceconnections.com/2013/04/25/understanding-solr-soft-commits-and-data-durability/
which contradicts Solr's web site:
Skimmed this, but yes, docs are durable thanks to the transaction log, which
can be replayed on start.
Otis
Solr ElasticSearch Support
http://sematext.com/
On Mar 13, 2014 8:25 PM, shushuai zhu ss...@yahoo.com wrote:
Hi,
I noticed the following post indicating that Solr could recover
not-committed
Hi,
Please update my account so I can edit the wiki https://wiki.apache.org/solr.
GregG / greg22...@yahoo.com
Specifically, I was installing Solr on Windows using Tomcat following the
instructions on https://wiki.apache.org/solr/SolrInstall, and had some issues
with the instructions and
Done, thanks! We can always use more editors who contribute their experiences...
On Thu, Mar 13, 2014 at 8:01 PM, Greg Gilles greggil...@yahoo.com wrote:
Hi,
Please update my account so I can edit the wiki https://wiki.apache.org/solr.
GregG / greg22...@yahoo.com
Specifically, I was
Any help on this is much appreciated. Is it better to use more cores for
zookeeper (as opposed to 1 core machine)?
On Wed, Mar 12, 2014 at 4:28 PM, Chris W chris1980@gmail.com wrote:
Hi Furkan
Load on the network is very low when read workload is on the cluster.
During indexing, a few
What about SEO? If somebody gives me Google Analytics access, I would
be happy to dig around that for a while to see if people can actually
find stuff on the Wiki.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is
or why haven't I thought of this before?
I'm once again being faced with the recurring problem of phrase
searches with wildcards. It'll lead to index bloat, but that's
acceptable in this situation, at least until proved not so.
The surround query parser can deal with wildcards and proximity, but
Different but (conceptually) similar?
http://robotlibrarian.billdueber.com/2012/03/boosting-on-exactish-anchored-phrase-matching-in-solr-sst-4/index.html
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the
Hi,
Is there any way to secure the Solr index directory? I have many users on
a server and I want to restrict file access to only the administrator.
Does securing the index directory affect Solr's access to the folder?
Thanks,
Prasi
On 3/13/2014 7:24 PM, Chris W wrote:
Any help on this is much appreciated. Is it better to use more cores for
zookeeper (as opposed to 1 core machine)?
I would guess that disk latency is the biggest bottleneck for zookeeper.
Unless the SolrCloud install is quite large, I don't think that much
It really depends, hard to give a definitive instruction without more
pieces of info.
e.g. if your CPUs are all maxed out and you already have a high number of
concurrent queries, then sharding may not be of any help at all.
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr