Hi Joel,
Right now, we are (web) crawling almost 85 million documents, and this
could double. The collection is simply divided into shards, so a
search runs across all shards.
Is it possible for a system to distribute documents into shards based
on document similarity?
A million collections is rather drastic, but just as a basic
answer, you also have collection aliases (in SolrCloud mode):
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CreateormodifyanAliasforaCollection
You can also send requests passing parameters in POST, r
Yes, URL length is also one of my concerns. If, say, I have a million
collections, must I specify all the collection names in the request to
perform a search across all of them? The reason I want to combine data
config into a single node is that I feel it is impractical to search
large amo
I think the easiest thing would then be to put 'q' in the invariants
section and use parameter substitution to get the user query.
Use either
https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries#LocalParametersinQueries-ParameterDereferencing
or
https://cwiki.apache.org/conflue
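For illustration, a sketch of what that could look like in solrconfig.xml. The handler name, the `qf` field, and the `uq` parameter name are my own placeholders, not something from the thread:

```xml
<requestHandler name="/mysearch" class="solr.SearchHandler">
  <lst name="invariants">
    <!-- q is fixed as an invariant; the actual user input arrives
         via the uq request parameter and is dereferenced with v=$uq -->
    <str name="q">{!dismax qf=title v=$uq}</str>
  </lst>
</requestHandler>
```

A request would then pass the user text as `...?uq=some+user+query`, and clients cannot override the invariant `q`.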
I believe the DIH config is read on every import, so it is
entirely possible to have just one handler and pass a parameter for
which specific file to use as the configuration.
It is also possible to pass the full configuration in the dataConfig
URL parameter. Need to watch out
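As a hedged sketch with plain JDK classes (the host, core name, and handler path are assumptions for illustration), building such a full-import URL with the configuration passed inline via the dataConfig parameter could look like:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DihUrlExample {
    // Build a full-import URL that passes the DIH configuration inline
    // via the dataConfig parameter (URL-encoded).
    public static String buildImportUrl(String baseUrl, String dataConfig) throws Exception {
        String encoded = URLEncoder.encode(dataConfig, StandardCharsets.UTF_8.name());
        return baseUrl + "/dataimport?command=full-import&dataConfig=" + encoded;
    }

    public static void main(String[] args) throws Exception {
        String cfg = "<dataConfig><document/></dataConfig>";
        System.out.println(buildImportUrl("http://localhost:8983/solr/mycore", cfg));
    }
}
```

The encoding step matters because the configuration is XML and would otherwise break the query string.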
Shawn, thank you. This was exactly what I was looking for.
I am already using SolrJ, so the following two lines did the job:
ZkConfigManager configManager =
    new ZkConfigManager(cloudSolrClient.getZkStateReader().getZkClient());
configManager.uploadConfigDir(Paths.get(configPath), configName);
Tha
On 4/6/2016 3:26 PM, Don Bosco Durai wrote:
> I want to automate the entire process from my Java process which is not
> running on any of the servers where SolrCloud is running. In short, I don’t
> have access to bin/solr or server/scripts/cloud-scripts, etc from my
> application. So I was wonder
Hmmm...Not sure I understand, but it sounds like you've found the best
solution for the limitations you're experiencing...
On Wed, Apr 6, 2016 at 4:38 PM, Don Bosco Durai wrote:
> My challenge is, the server where my application is running doesn’t have
> Solr bits installed.
>
> Right now
My challenge is, the server where my application is running doesn’t have Solr
bits installed.
Right now I am asking users to install (just unzip) Solr on any server, and I
give them a shell script to run from the command line before starting my
application. It is inconvenient, so I wa
Adding Yonik,
I almost implemented a custom aggregate function using the new facet API, but
later got runtime exceptions because "FacetContext" is not public, so it looks
like facet API components can't be created as external plugins.
I succeeded in using the AnalyticsQuery API to perform what I want.
Yonik can
Therefore, this becomes possible:
http://stackoverflow.com/questions/525212/how-to-run-unix-shell-script-from-java-code
Hackish, but certainly doable... Given there's no API...
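A minimal sketch of that approach with the JDK's ProcessBuilder (the command shown is just `echo`; in practice you would substitute the zkcli.sh invocation and its arguments):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ScriptRunner {
    // Run a command, capture its combined stdout/stderr, and return it trimmed.
    public static String run(String... command) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // e.g. run("sh", "zkcli.sh", "-zkhost", "...", "-cmd", "upconfig", ...)
        System.out.println(run("echo", "hello"));
    }
}
```

Checking the exit code via `p.waitFor()` and failing on non-zero would be a sensible addition in real use.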
On Wed, Apr 6, 2016 at 3:44 PM, John Bickerstaff
wrote:
> Yup - just tested - that command runs fine with Solr NOT ru
Yup - just tested - that command runs fine with Solr NOT running...
On Wed, Apr 6, 2016 at 3:41 PM, John Bickerstaff
wrote:
> If you can get to the IP addresses from your application, then there's
> probably a way... Do you mean you're firewalled off or in some other way
> unable to access the
If you can get to the IP addresses from your application, then there's
probably a way... Do you mean you're firewalled off or in some other way
unable to access the Solr box IP's from your Java application?
If you're looking to do "automated build of virtual machines" there are
some tools like Va
Right... You can store that anywhere - but at least consider not storing
it in your existing SOLR collection just because it's there... It's not
really the same kind of data -- it's application meta-data and/or
user-specific data...
Getting it out later will be more difficult than if you store i
I have SolrCloud pre-installed. I need to create a collection, but before that
I need to load the config into zookeeper.
I want to automate the entire process from my Java process, which is not running
on any of the servers where SolrCloud is running. In short, I don’t have access
to bin/solr or
That's more of an app-level feature, there's nothing in Solr that does
this for you.
Some people have used a different Solr collection to store the queries
as strings for display, but that's again something you build on top of
Solr, not a core feature.
Best,
Erick
On Wed, Apr 6, 2016 at 2:32 AM,
you can mitigate the impact of throwing away caches on soft commits by
doing appropriate autowarming, both the newSearcher and cache settings
in solrconfig.xml.
Be aware that you don't want to go overboard here, I'd start with 20
or so as the autowarm counts for queryResultCache and filterCache.
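As a hedged sketch in solrconfig.xml (the cache sizes and the warming query are placeholders; only the autowarmCount of 20 comes from the advice above):

```xml
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="20"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="20"/>

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- representative warming queries for the new searcher; placeholders only -->
    <lst><str name="q">*:*</str><str name="sort">timestamp desc</str></lst>
  </arr>
</listener>
```

The listener runs its queries against each new searcher before it serves traffic, which is what keeps soft commits from serving entirely cold caches.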
As of Solr 5.5 the bin/solr script can do this, see:
https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
It's still not quite what you're looking for, but uploading arbitrary
xml scripts through a browser is a security issue, so it's possible
there will never be an API cal
As of now, there's no way to do so. There were some efforts along those lines,
but they've been on hold.
-Anshum
> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai wrote:
>
> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh -zkhost
> $zk_host -cmd upconfig -confdir $config_folder -confna
On 4/6/2016 11:07 AM, shamik wrote:
> Thanks Alessandro, that answers my question. In a nutshell, to make the MLT query
> parser work, you need to know the document id. I'm just curious why this
> constraint was added. This will not work for a bulk of use cases, for
> example, if we are trying to gene
Hi Alessandro,
Thanks for replying!
Here are my answers inline.
1. "First of all, simple string autosuggestion or document autosuggestion?
(with more additional fields to show than the label)"
Document autosuggestions
2. "Are you interested in the analysis for the text to suggest? Fuzzy
s
Hi Alessandro,
Thanks for replying!
Here are my answers inline.
On Mon, Apr 4, 2016 at 6:34 PM, Alessandro Benedetti
wrote:
> Hi Chandan,
> I will answer as my previous answer to a similar topic that got lost :
> "First of all, simple string autosuggestion or document autosuggestion ? (
> w
Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh -zkhost
$zk_host -cmd upconfig -confdir $config_folder -confname $config_name using
APIs?
I want to bootstrap by uploading the configs via API. Once the configs are
uploaded, I am now able to do everything else via API.
Thanks
B
I don't know of any contrib or module that does this. Can you describe why
you'd want to route documents to shards based on similarity? What
advantages would you get by using this approach?
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Apr 6, 2016 at 1:36 PM, davidphilip cherian <
davidphi
Any thoughts?
On Tue, Apr 5, 2016 at 9:05 PM, davidphilip cherian <
davidphilipcher...@gmail.com> wrote:
> Hi,
>
> Is there any contribution (open-source contrib module) that routes
> documents to shards based on document similarity techniques? Or any
> suggestions that integrate Mahout with Solr f
Why not copy the field values of category, title, features, and spec into a
common text field and then search on that field? Otherwise, use the edismax
query parser and search with the user's search string on all the above fields,
perhaps boosting the title, category, and specs fields in order to get relevant
results.
Please note the exact description of the property on the URL you
mentioned:
"The TZ parameter can be specified to override the default TimeZone (UTC)
used for the purposes of adding and rounding in date math"
The newer ref guide docs for this param also explain...
https://cwiki.apache.or
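The rounding effect can be illustrated with plain java.time (this mirrors what TZ does for date math like NOW/DAY, it is not Solr code): truncating the same instant to the start of its day gives different UTC instants depending on the timezone:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

public class TzRounding {
    // Round an instant down to the start of its day in the given zone,
    // and return the result as a UTC instant.
    public static Instant startOfDay(Instant instant, ZoneId zone) {
        return instant.atZone(zone).truncatedTo(ChronoUnit.DAYS).toInstant();
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2016-04-06T23:30:00Z");
        // In UTC the day starts at midnight UTC; in Berlin (UTC+2 in April)
        // the same instant already belongs to April 7, whose local midnight
        // is 22:00 UTC on April 6.
        System.out.println(startOfDay(t, ZoneOffset.UTC));             // 2016-04-06T00:00:00Z
        System.out.println(startOfDay(t, ZoneId.of("Europe/Berlin"))); // 2016-04-06T22:00:00Z
    }
}
```

This is why a query with TZ=Europe/Berlin and a range like [NOW/DAY TO NOW/DAY+1DAY] selects a different 24-hour window than the UTC default.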
Thanks Alessandro, that answers my question. In a nutshell, to make the MLT query
parser work, you need to know the document id. I'm just curious why this
constraint was added. This will not work for a bulk of use cases, for
example, if we are trying to generate MLT based on a text or a keyword, how
w
I think that's how I would approach it. I used the command line instead of the
REST API to create the collection, but I think that just generates the REST API
command via curl... so that will be no different as far as I can tell; I'm
just more comfortable on the command line.
Step 8 is the thing I'm not sure ab
Hi John, Shawn,
Thanks for replying to my query. I really appreciate your responses.
Ideally I’d like to do a node-by-node rolling upgrade from 4.4 to 5.5,
but I gave up on the rolling-upgrade approach because I faced issues with the
SolrJ 4.4 client connecting to a 5.5 cluster, or the 5.5 SolrJ client connecting
Wait a second, let's avoid any confusion.
We can have different inputs for a More Like This request handler (if this
is what you were using):
1) the id of the document we want to find similar documents to
2) a bunch of text
Then you have a lot of parameters that will affect the MLT core.
Spec
I'll agree with Shawn too - munging Zookeeper by hand can lead to VERY
unexpected results...
My recommendation would be to start fresh with a new 5.x setup and a new
/chroot in Zookeeper.
(This can be deleted and recreated repeatedly if necessary - I know because
I did... a lot... before I got it
I recently upgraded from 4.x to 5.5 -- it was a pain to figure it out, but
it turns out to be fairly straightforward...
Caveat: Because I run all my data into Kafka first, I was able to easily
re-create my collections by running a microservice that pulls from Kafka
and dumps into Solr.
I have a r
I'm bordering on a development post, but I want to write an authentication plugin
that uses proxy authentication and a whitelist.
So, it will accept a request header such as REMOTE_USER as the username from
certain hosts, by default 127.0.0.1 and ::1.
I also thought about having a whitelist of IPs th
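A minimal sketch of the whitelist check (class and method names are my own, not from any Solr API): normalizing both sides through InetAddress means, e.g., ::1 and 0:0:0:0:0:0:0:1 compare equal:

```java
import java.net.InetAddress;
import java.util.HashSet;
import java.util.Set;

public class IpWhitelist {
    private final Set<InetAddress> allowed = new HashSet<>();

    // Addresses may be literals like "127.0.0.1" or "::1";
    // getByName does no DNS lookup for literal IPs.
    public IpWhitelist(String... addresses) throws Exception {
        for (String a : addresses) {
            allowed.add(InetAddress.getByName(a));
        }
    }

    public boolean isAllowed(String remoteAddr) {
        try {
            return allowed.contains(InetAddress.getByName(remoteAddr));
        } catch (Exception e) {
            return false; // unparsable address -> reject
        }
    }

    public static void main(String[] args) throws Exception {
        IpWhitelist w = new IpWhitelist("127.0.0.1", "::1");
        System.out.println(w.isAllowed("0:0:0:0:0:0:0:1")); // true
    }
}
```

In a real plugin you would take remoteAddr from the servlet request rather than trusting a header, since headers are spoofable.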
I haven't traced through all the code recently, so I can't dispute Jan if
he knows a place that checks the output of the pf phrase analysis to see if
it is a single term, but... the INPUT to pf is definitely multiple clauses.
Regardless of the use of the keyword tokenizer, the query parser sees two
On 4/6/2016 7:13 AM, jimi.hulleg...@svensktnaringsliv.se wrote:
> Ah, thanks. It never occurred to me that clicking on the text "Create" would
> give me a different result compared to clicking on the arrow. In my mind,
> "Create" was simply the label, and the arrow indicating a dropdown option fo
I suppose q is a singular parameter and doesn't accept multiple values.
On Wed, Apr 6, 2016 at 1:01 PM, Anand Chandrashekar
wrote:
> Greetings.
>
> 1) A join query creates an array of "q" parameter. For example, the query
>
>
> http://localhost:8983/solr/gettingstarted/select?q=projectdata%3A%22top+secre
On Wednesday, April 6, 2016 2:50 PM, apa...@elyograg.org wrote:
>
> If you can only create a service desk request, then you might be clicking the
> "Service Desk" menu item,
> or maybe you're clicking the little down arrow on the right side of the big
> red "Create" button.
> Try clicking the
Thanks.
I googled for examples of how to proceed and noticed that you opened
SOLR-8951.
Thanks again
-Original Message-
From: Jan Høydahl [mailto:jan@cominvent.com]
Sent: Wednesday, April 06, 2016 4:18 AM
To: solr-user@lucene.apache.org
Subject: Re: BYOPW in security.json
H
On 4/5/2016 3:08 PM, Anuj Lal wrote:
> I am new to solr. Need some advice from more experienced solr team members
>
> I am upgrading 4.4 solr cluster to 5.5
>
> One of the steps I am doing for the upgrade is to bootstrap from the existing 4.4 solr
> home (after upgrading the solr installation to 5.5)
We'll
On 4/6/2016 2:35 AM, jimi.hulleg...@svensktnaringsliv.se wrote:
> I guess I can conclude that this is a bug. But I wasn't able to report it in
> Jira. I just got to some servicedesk form
> (https://issues.apache.org/jira/servicedesk/customer/portal/5/create/27) that
> didn't seem related to solr
Anand,
have a look at the example schema; there is a section that explains
"invariants", which could be one solution to your question.
-Stefan
On Wed, Apr 6, 2016 at 12:01 PM, Anand Chandrashekar
wrote:
> Greetings.
>
> 1) A join query creates an array of "q" parameter. For example, the query
>
> Oh, hang on... If a phrase is defined as multiple tokens, and pf is used for
> phrase boosting, does that mean that even with a regular tokenizer the pf
> won't work for fields that only contain one word? For example if the title of
> one document is "John", and the user searches for 'John' (
On Wed, Apr 6, 2016 at 7:53 AM, Robert Brown wrote:
> The QTime's are from the updates.
>
> We don't have the resource right now to switch to SolrJ, but I would
> assume only sending updates to the leaders would take some redirects out of
> the process,
How do you route your documents now?
Aren
At the moment, the tz parameter will be used to calculate the UTC date in
the query, based on the tz supplied.
In the index, the dates are in UTC.
To show the dates in the same timezone we query in, we should implement a
DocTransformer[1].
This DocTransformer will check for all (or a subset) of date
Greetings.
1) A join query creates an array of "q" parameter. For example, the query
http://localhost:8983/solr/gettingstarted/select?q=projectdata%3A%22top+secret+data2%22&q=%7B!join+from=locatorByUser+to=locator%7Dusers=joe
creates the following array elements for the "q" parameter.
[array en
I understand. Would be nice though :)
Thanks.
On 04/06/2016 11:26 AM, jimi.hulleg...@svensktnaringsliv.se wrote:
I think that this parameter is only used to interpret the dates provided in the
query, like query filters. At least that is how I interpret the wiki text. Your
interpretation makes
I understand.
Although I am not exactly sure how to solve this one, this should serve as
a helpful starting point:
https://lucidworks.com/resources/webinars/natural-language-search-with-solr/
On Wed, 6 Apr 2016, 11:27 Midas A, wrote:
> thanks Binoy for replying,
>
> i am giving you a few use case
Hi,
I have designed a web page on which users can search and filter their data
based on some term facets. I am using Apache Solr 5.3.1 for this. It is
working perfectly fine.
Now my requirement is to save the query which I have executed on Solr, so,
in the future, if I need to search for the same resu
I think that this parameter is only used to interpret the dates provided in the
query, like query filters. At least that is how I interpret the wiki text. Your
interpretation makes more sense in general though; it would be nice if it were
possible to modify the timezone for both the query and the
OK, well I'm not sure I agree with you. First of all, you ask me to point my
"pf" towards a tokenized field, but I already do that (the fact that all text
is tokenized into a single token doesn't change that fact). Also, I don't agree
with the view that a single-term phrase is never valid/reason
Hi,
According to the wiki
https://wiki.apache.org/solr/CoreQueryParameters#TZ I can use the TZ
param to specify the timezone.
I tried to make a query and put in the raw section TZ=Europe/Berlin or
any other found in
https://en.wikipedia.org/wiki/List_of_tz_database_time_zones but no
luck. Th
Hi,
Phrase match via “pf” requires the target field to contain a phrase. A phrase
is defined as multiple tokens. Yours does not contain a phrase since you use
the KeywordTokenizer, leaving only one token in the field. eDismax pf will thus
never kick in. Please point your “pf” towards a tokenize
I guess I can conclude that this is a bug. But I wasn't able to report it in
Jira. I just got to some servicedesk form
(https://issues.apache.org/jira/servicedesk/customer/portal/5/create/27) that
didn't seem related to solr in any way, (the affects/fix version fields didn't
correspond to any s
Hi
Note that storing the user names and passwords in security.json is just one
implementation, to easily get started. It uses the Sha256AuthenticationProvider
class, which is pluggable. That means that if you require Basic Auth with some
form of self-service management, you could/should add ano