Be sure to send plain text emails, not HTML, and watch out for
things that could be considered spam. Apache mail servers do receive a
LOT of spam, so they need to have relatively aggressive spam filters in
place.
Upayavira
On Thu, Jul 23, 2015, at 07:29 PM, Steven White wrote:
Hi Everyone,
Well you've at least confirmed what I was thinking :).
I am using payloads now for this and I think I have something very basic
working. The results don't get dropped when the scores are 0, so I also
had to write a custom collector that could be plugged into the
AnalyticQueryAPI (maybe there
On 7/23/2015 3:14 PM, Darin Amos wrote:
I have been trying to run the SOLR war with embedded Jetty and can't seem to
get the config quite right. Is there any known documentation on this, or is
someone else doing this? I seem to just be setting up a document server at my
solr.home directory.
Given following Impala query:
SELECT date, SUM(CAST(price AS DOUBLE)) AS price
FROM table
WHERE date='2014-01-01' AND store_id IN(1,2,3)
GROUP BY date;
To work with Solr
1. Will it be more efficient to directly use equivalent Solr query? Any
curl command equivalent to the
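One equivalent approach can be sketched as follows (an assumption, not something stated in the thread: the collection is named `table`, the schema has `date`, `price`, and `store_id` fields, and the StatsComponent is used for the sum). Because the WHERE clause pins `date` to a single value, the GROUP BY collapses to one group, so a plain stats sum over the filtered docs is enough:

```python
from urllib.parse import urlencode

# Build the Solr request that mirrors the Impala aggregate:
# filter on date and store_id, then let stats.field sum `price`.
params = {
    "q": "*:*",
    "fq": ["date:2014-01-01", "store_id:(1 2 3)"],
    "stats": "true",
    "stats.field": "price",
    "rows": "0",   # we only want the aggregate, not the docs
    "wt": "json",
}
query_string = urlencode(params, doseq=True)

# The curl equivalent asked about in the question (hypothetical host):
curl_cmd = f"curl 'http://localhost:8983/solr/table/select?{query_string}'"
print(curl_cmd)
```

For multiple dates (a real GROUP BY), a facet on `date` with a sum aggregation would be the closer analogue.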
Ah, now we're on to something! Solr 4.10.0 is also using the same
zookeepers, and both are using Oracle Java 8 JRE.
Did some research and uploaded a new config to zookeeper using chroot to
isolate them. Changed the init script to have
ZK_Host=zk1,zk2,zk3/DevConfigs. I did see that you should
Hello,
I have been trying to run the SOLR war with embedded Jetty and can't seem to
get the config quite right. Is there any known documentation on this, or is
someone else doing this? I seem to just be setting up a document server at my
solr.home directory. The code snippet below seems
Hi Shawn,
Thanks for your help.
I settled on the following solution, that I am in the process of testing out:
<entity name="LEAP_PARTY" pk="LEAP_PARTY_ID"
query="SELECT DISTINCT
'LEAP_PARTY' AS DOCUMENT_TYPE, VPARTY.OWNER AS PARTY_OWNER,
Yay!
On Thu, Jul 23, 2015, at 10:13 PM, Aaron Gibbons wrote:
Ah, now we're on to something! Solr 4.10.0 is also using the same
zookeepers, and both are using Oracle Java 8 JRE.
Did some research and uploaded a new config to zookeeper using chroot to
isolate them. Changed the init script to
Sometimes when echoing back the whole thread, it looks like spam.
On Thu, Jul 23, 2015 at 1:42 PM, Steven White swhite4...@gmail.com wrote:
Three emails to the existing subject of Basic auth didn't make it. As
you may have seen, I started a new email thread on this subject under
Basic Auth
On 7/23/2015 10:55 AM, cbuxbaum wrote:
Say we have 100 party records. Then the child SQL will be run 100
times (once for each party record). Isn't there a way to just run the child
SQL on all of the party records at once with a join, using a GROUP BY and
ORDER BY on the PARTY_ID?
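The join-once idea can be sketched like this (illustrative column names, and the client-side half only rather than DIH config): run the child query once, ordered by PARTY_ID, then split the rows in a single pass instead of issuing one query per party:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical result of ONE joined child query,
# i.e. SELECT ... FROM child JOIN party ... ORDER BY PARTY_ID,
# replacing 100 per-party queries.
child_rows = [
    {"PARTY_ID": 1, "CHILD_VAL": "a"},
    {"PARTY_ID": 1, "CHILD_VAL": "b"},
    {"PARTY_ID": 2, "CHILD_VAL": "c"},
]

# groupby requires the input already ordered by the key,
# which is exactly what the ORDER BY guarantees.
children_by_party = {
    party_id: [r["CHILD_VAL"] for r in rows]
    for party_id, rows in groupby(child_rows, key=itemgetter("PARTY_ID"))
}
print(children_by_party)
```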
Hi Steve,
We've not yet moved to Solr 5, but we do use Jetty 9. In any case, Basic
Auth is a Jetty thing, not a Solr thing.
We do use this mechanism to great effect to secure things like index
writers and such, and it does work well once it's set up.
Jetty, as with all containers, is a bit fussy
Is there a way to patch? I am using 5.2.1 and using json facet in production.
On Jul 16, 2015, at 1:43 PM, Yonik Seeley ysee...@gmail.com wrote:
To anyone using the JSON Facet API in released Solr versions:
I discovered a serious memory leak while doing performance benchmarks
(see
There are a few odd things here, looking at your explains output:
First search, first result:
6.357613 = (MATCH) weight(description:jackshaft in 3339)
[DefaultSimilarity], result of:
6.357613 = fieldWeight in 3339, product of:
1.0 = tf(freq=1.0), with freq of:
1.0 = termFreq=1.0
Same issue really as long as there's more than one replica/shard.
Tied scores are broken by internal ID, specifying a
secondary sort should regularize things.
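The effect of a secondary sort on tied scores can be illustrated with synthetic docs (a sketch, not Solr code): two docs with identical scores order arbitrarily by internal doc ID, which differs per replica, unless a tie-breaking key is added:

```python
# Two docs tied at 2.5: without a tie-breaker their relative order
# depends on internal Lucene doc IDs, which can differ per replica.
docs = [
    {"id": "B", "score": 2.5},
    {"id": "A", "score": 2.5},
    {"id": "C", "score": 1.0},
]

# Equivalent of sort=score desc, id asc: deterministic on every replica.
stable = sorted(docs, key=lambda d: (-d["score"], d["id"]))
print([d["id"] for d in stable])
```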
Best,
Erick
On Thu, Jul 23, 2015 at 10:01 AM, Tarala, Magesh mtar...@bh.com wrote:
Erick,
The 3 node cluster is setup to use 3 shards
On Thu, Jul 23, 2015 at 5:00 PM, Harry Yoo hyunat...@gmail.com wrote:
Is there a way to patch? I am using 5.2.1 and using json facet in production.
First you should see if your queries tickle the bug...
check the size of the filter cache from the admin screen (under
plugins, filterCache)
and see
I don't have this issue.
I have tried various json facet queries and my filter cache always comes
down to the 'minsize' (it never exceeds the configured size) with solr version
5.2.1, and all my queries are nested json facets.
On 23-Jul-2015, at 7:43 pm, Yonik Seeley ysee...@gmail.com wrote:
On
You could try stopping SOLR, going into the data directory and rm -rf * and
starting SOLR again.
Did you use the schema REST API? Residual?
On Thu, Jul 23, 2015 at 6:57 PM, Shamik Bandopadhyay sham...@gmail.com
wrote:
Hi,
I'm facing this weird error while running result grouping queries.
Hi,
I'm facing this weird error while running result grouping queries. This
started when I turned on docvalues for an existing facet field and
indexed the documents. Looking at the exception, I reverted the change
and re-indexed the documents. But I'm still getting the exception,
What it looks like is kinda as Erick suggested - the scores are the same
for some docs, so it probably depends upon which order they come back
from the shards as to which will be shown first.
If the issue is that the score is the same for some docs, try adding a
deliberate sort:
sort=score
I added the explicit sort:
http://server1.domain.com:8983/solr/serviceorder_shard1_replica2/select?q=description%3Ajackshaft&fl=service_order&wt=json&indent=true&debugQuery=true&sort=score%20desc,id%20asc
Still seeing the same behavior - inconsistent results:
First time I run:
Hi,
I have implemented a new file-type parser for Tika. It parses a custom
filetype (*.mx)
I would like my Solr instance to use my version of Tika with the mx parser.
I found this by a google search
https://lucidworks.com/blog/extending-apache-tika-capabilities/
But it seems to be over 5
Hi Petter,
I'm on Solr 5.2.1 which comes with Jetty 9.2. I'm setting this up on
Windows 2012 but will need to do the same on Linux too.
I followed the step per this link:
https://wiki.apache.org/solr/SolrSecurity#Jetty_realm_example very much to
the book. Here are the changes I made:
File:
Three emails to the existing subject of Basic auth didn't make it. As
you may have seen, I started a new email thread on this subject under
Basic Auth (again) and now they are making it to the list.
I don't know what to make of this.
Steve
On Thu, Jul 23, 2015 at 4:31 PM, Upayavira
Sorry for being vague, I'll try to explain more. In my use case a
particular field does not have a security control, it's the data in the
field. So for instance if I had a schema with a field called name, there
could be data that should be secured at A, B, AB, A|B, etc within that
field. So
I've seen something like this on another system - where the OR is
consumed as a query term rather than an operator.
Remember that Edismax will use the Lucene query parser (which supports
OR, etc) unless there is an exception, and defer to dismax if there is a
syntax error.
What I'd suggest here
I'd still like to just confirm that you're using the same Java for
running Solr and for running bin/solr.
When you run bin/solr you are doing that on the instance itself?
You show a collections API URL below. Does that fail the same way?
Basically, the exception you showed was a SolrJ error.
Hi Upayavira - the URL was:
http://server1:9100/solr/MYCOL1/clustering?q=Collection:(COLLECT1008+OR+COLLECT2587)+AND+(amazon+AND+soap)&wt=json&indent=true&clustering=true&rows=1&df=FULL_DOCUMENT&debugQuery=true
Here is the relevant part of the response - notice that the default
field (FULL_DOCUMENT)
On 7/23/2015 7:51 AM, Joseph Obernberger wrote:
Hi Upayavira - the URL was:
http://server1:9100/solr/MYCOL1/clustering?q=Collection:(COLLECT1008+OR+COLLECT2587)+AND+(amazon+AND+soap)&wt=json&indent=true&clustering=true&rows=1&df=FULL_DOCUMENT&debugQuery=true
Here is the relevant part of the
A quick follow up, after finding and eliminating some code that was
generating multiple update requests per second, applying the CMS GC
tuning options, and upgrading to Java 8, we've not experienced a single
long term GC pause. The java 8 upgrade got rid of the final couple of
pauses during
Markus,
the first idea that comes to my mind is this:
1) you configure your schema, creating your field types and, if necessary,
the associated fields
2) you build an UpdateRequestProcessor that does a conditional check per
document and creates the proper fields starting from one input field.
In this
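A rough client-side analogue of step 2 (a Python sketch standing in for the actual Java UpdateRequestProcessor; the field names `raw_value`, `raw_value_i`, and `raw_value_s` are invented for illustration):

```python
def add_typed_fields(doc):
    """Per-document conditional check: create a typed field from one
    input field, mimicking what the UpdateRequestProcessor would do."""
    value = doc.get("raw_value")
    if value is None:
        return doc
    if isinstance(value, str) and value.isdigit():
        doc["raw_value_i"] = int(value)   # route to an int dynamic field
    else:
        doc["raw_value_s"] = str(value)   # fall back to a string field
    return doc

doc = add_typed_fields({"id": "1", "raw_value": "42"})
print(doc)
```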
*When you run bin/solr you are doing that on the instance itself? *
Yes
*You show a collections API URL below. Does that fail the same way?*
Error from API:
50042java.io.InvalidClassException:
org.apache.solr.client.solrj.SolrResponse; local class incompatible: stream
classdesc serialVersionUID =
Hello - the title says it all. When indexing a document, we need to run one or
more additional filters depending on the value of a specific field. Likewise,
we need to run that same filter over the already analyzed tokens when querying.
This is not going to work if I extend TextField, at all.
I originally started using Ansible playbooks which did install the JDK
(with the same error), but have been doing manual installs to take Ansible
completely out of the equation.
Safari wasn't showing the XML response, so I ran this in Chrome..
I have about 15K documents in a 3 node solr cluster. When I execute a simple
search, I get the results in different order every time I search. But the
number of records is the same. Here's the definition for the field.
Any ideas, suggestions would be greatly appreciated.
fieldType
Thanks for letting us know how it turned out. Too often I'm never sure
what actually _worked_
Erick
On Thu, Jul 23, 2015 at 8:56 AM, Jeremy Ashcraft jashcr...@edgate.com wrote:
A quick follow up, after finding and eliminating some code that was
generating multiple update requests per
Hi, We are trying to improve the performance of our data import. We tried
using the CachedSqlEntityProcessor implementation, but that is apparently
broken.
I am looking at the workaround described below:
https://issues.apache.org/jira/browse/SOLR-3857
Have you tried it with a JDK? I tend to use JDK rather than JRE, but
don't recall whether this is a specific requirement for Solr.
Can you show the URL you use for the API, and the JSON/XML response you
get? I wouldn't expect to see mention of solrj in the API because it
isn't used. Just for the
That worked for most of my attributes. I have only one issue to fix. How
would I convert boolean values to integers? For example:
<doc>
...
<bool name="pr">false</bool>
</doc>
to
<ID pr="0">
</ID>
Is that possible as well?
On that note, what version of XSLT should I assume SOLR supports?
I've mainly used Oracle Java 8, but tested 7 also. Typically I'll wipe the
machines and start from scratch before installing a different version. The
latest attempt followed these steps exactly on each machine:
- sudo apt-get install python-software-properties
- sudo add-apt-repository
Hmmm, what other Solr nodes do you have connected to Zookeeper? Are any
of them running a different Java or Solr version?
It looks like you have another node connected to your Zookeeper that has
taken the role of overseer and it is sending back serialized java
objects that your own node cannot
<xsl:template match="doc">
<ID NewID="...">
<xsl:apply-templates select="pr"/>
</ID>
</xsl:template>
<xsl:template match="bool[.='false']">
<xsl:attribute name="{@name}">0</xsl:attribute>
</xsl:template>
<xsl:template match="bool[.='true']">
<xsl:attribute name="{@name}">1</xsl:attribute>
</xsl:template>
Note, if you find XSLT
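If XSLT gets unwieldy, the same bool-to-attribute transform can be sketched with Python's stdlib ElementTree (element names taken from the thread; the `ID` output element is illustrative):

```python
import xml.etree.ElementTree as ET

src = ET.fromstring('<doc><bool name="pr">false</bool></doc>')

# Mirror the XSLT templates above: each <bool> becomes an attribute
# on the output element, with false -> 0 and true -> 1.
out = ET.Element("ID")
for b in src.findall("bool"):
    out.set(b.get("name"), "1" if b.text == "true" else "0")

print(ET.tostring(out, encoding="unicode"))
```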
I'm executing a very simple search in a 3 node cluster - 3 shards with 1
replica each. Solr version 4.10.2:
http://server1.domain.com:8983/solr/serviceorder_shard1_replica2/select?q=description%3Ajackshaft&fl=service_order&wt=json&indent=true&debugQuery=true
I'm getting different scores when I run
Hi Everyone,
I'm seeing that some of my emails are not making it to the mailing list and
I confirmed that I'm subscribed:
Hi! This is the ezmlm program. I'm managing the
solr-user@lucene.apache.org mailing list.
I'm working for my owner, who can be reached
at
(re-posting as new email thread to see if this will make it to the list)
That didn't help. I still get the same result and virtually no log to help
me figure out where / what things are going wrong.
Here is all that I see in C:\Solr\solr-5.2.1\server\logs\solr.log:
INFO - 2015-07-23
Hi Steve,
What version of Jetty are you using?
Have you got a webdefault.xml in your etc folder?
If so, does it have an entry like this:
<login-config>
<auth-method>BASIC</auth-method>
<realm-name>Realm Name as specified in jetty.xml</realm-name>
</login-config>
It's been a few years since I
Hello,
I briefly described the similar problems at
http://blog.griddynamics.com/2015/07/how-to-import-structured-data-into-solr.html
Let me know if you have further questions
On Thu, Jul 23, 2015 at 7:55 PM, cbuxbaum cbuxb...@tradestonesoftware.com
wrote:
That's OK, I have determined that
bq: I do not understand why anyone would ever use facet.prefix or
facet.contains for any use other than a development...
Gotta disagree a bit here. AFAIK, it depends on the number of unique
terms in the field. How would either one be worse than facet.field?
And you can freely use facet.field on a
bq: Your ugly problem is my situation I think ;)
No, your problem is much worse ;(
The _contents_ of fields are restricted, which is
horrible.
OK, here's another idea out of waaay left field: Payloads.
It hinges on there being an OK number of possible combinations
which seems to be the
The query you're running would help. But here's a guess:
You say you have a 3 node Solr cluster. By that I'm
guessing you mean a single shard with 1 leader
and 2 replicas.
when the primary sort criteria (score by default) is tied
between two documents, the internal Lucene doc ID
is used as a
Your version of the config didn't come through, the
mail program is pretty aggressive about stripping attachments
and things.
Best,
Erick
On Thu, Jul 23, 2015 at 8:20 AM, cbuxbaum
cbuxb...@tradestonesoftware.com wrote:
Hi, We are trying to improve the performance of our data import. We tried
That's OK, I have determined that caching is not relevant to our use case.
However, I have a question about the full import queries that we are using:
Here is the SQL from the top level entity:
query=SELECT DISTINCT 'LEAP_PARTY' AS
DOCUMENT_TYPE, VPARTY.OWNER AS
Erick,
The 3 node cluster is setup to use 3 shards each with 1 replica. So, the index
is split on 3 servers.
Another piece of info - I think the issue happens only when I use pagination.
Verifying if that's the case..
Here's a query from the solr log on the server I'm pointing the query to:
Hi Dave and Markus,
I would definitely suggest using the *Suggester Component*.
In particular, for your use case I suggest the AnalyzingInfixLookup
strategy.
As usual I suggest:
Erick's post - http://lucidworks.com/blog/solr-suggester/
My post -
Well, if you had a result say:
...
<doc>
<str name="id">589587B2B1CA4C4683FC106967E7C326</str>
<str name="ar">EE3YYK</str>
<int name="age">31034</int>
</doc>
...
applying the template:
<xsl:template match="doc">
<ID NewID="{@id}" ... />
</xsl:template>
would result in the following XML:
<IMAGES>
<ID NewID=""/>
</IMAGES>
You are correct, I should have said:
<xsl:template match="doc">
<ID NewID="{str[@name='id']}" ... />
</xsl:template>
On Thu, Jul 23, 2015, at 10:15 AM, Sreekant Sreedharan wrote:
Well, if you had a result say:
...
<doc>
<str name="id">589587B2B1CA4C4683FC106967E7C326</str>
<str name="ar">EE3YYK</str>
int
Hello - You should index your terms as n-grams indeed, especially for
autocompletion. I do not understand why anyone would ever use facet.prefix or
facet.contains for any use other than a development tool. It won't perform on
any index larger than small.
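What edge n-gramming buys you can be sketched (a rough Python analogue of an EdgeNGram-style token filter; the min/max gram sizes are assumptions):

```python
def edge_ngrams(term, min_gram=2, max_gram=10):
    """Produce the leading n-grams an EdgeNGram-style filter would
    index, so prefix matching at query time becomes a cheap term
    lookup instead of a facet.prefix scan over the term dictionary."""
    return [term[:n] for n in range(min_gram, min(len(term), max_gram) + 1)]

grams = edge_ngrams("jackshaft")
print(grams)
```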
Jan Høydahl has put up a thorough