Query function error - can not use FieldCache on multivalued field

2020-09-14 Thread Shamik Bandopadhyay
int on this function, can this boost be done in a different way? Any pointers will be appreciated. Thanks, Shamik

Lemmatizer for Solr

2020-02-14 Thread Shamik Bandopadhyay
to our organization (not sure of kstem filter can do that). Any pointers will be appreciated. Regards, Shamik

Lemmatizer for indexing

2019-10-14 Thread Shamik Bandopadhyay
er can do that). Any pointers will be appreciated. Regards, Shamik

Fwd: Numeric value ignored by EdgeNGramFilterFactory

2019-07-04 Thread Shamik Bandopadhyay
ppreciated. Thanks, Shamik

Numeric value ignored by EdgeNGramFilterFactory

2019-07-04 Thread Shamik Bandopadhyay
72 is ignored and what'll be the best way to address this scenario? Any pointers will be appreciated. Thanks, Shamik

Numeric value ignored by EdgeNGramFilterFactory

2019-07-04 Thread Shamik Bandopadhyay
ppreciated. Thanks, Shamik

Re: Problem with white space or special characters in function queries

2019-03-28 Thread shamik
Thanks Jan, I was not aware of this, appreciate your help. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Problem with white space or special characters in function queries

2019-03-28 Thread shamik
Ahemad, I don't think it's related to the field definition; rather, it looks like an inherent bug. For the time being, I created a copyField which uses a custom regex to remove whitespace and special characters, and I use it in the function. I'll debug the source code and confirm if it's a bug, will raise a
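The workaround described above (a normalized copy of the field, queried with a normalized term) can be sketched roughly as follows. This is only an illustration: the exact regex used in the thread is not shown, so the pattern here is an assumption, and the field name ADSKFeature_norm is hypothetical. The same normalization would have to be applied at index time for the terms to match.

```python
import re


def normalize_term(value: str) -> str:
    """Drop whitespace and special characters (assumed pattern: keep letters, digits, underscore)."""
    return re.sub(r"[^A-Za-z0-9_]", "", value)


def termfreq_boost(field: str, raw_term: str, if_true: str, if_false: str) -> str:
    """Build an if(termfreq(...)) function string that targets the normalized copy field."""
    term = normalize_term(raw_term)
    return f"if(termfreq({field},'{term}'),{if_true},{if_false})"


# ADSKFeature_norm is a hypothetical copy field holding the normalized values.
print(termfreq_boost("ADSKFeature_norm", "CUI (Command)", "log(CaseCount)", "sqrt(CaseCount)"))
```

Because the normalized term contains no spaces or parentheses, the function parser no longer has to deal with a quoted multi-word value.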

Re: Problem with white space or special characters in function queries

2019-03-27 Thread shamik
I'm using Solr 7.5, here's the query: q=line=language:"english"=Source2:("topicarticles"+OR+"sfdcarticles")=url,title=ADSKFeature:"CUI+(Command)"^7=recip(ms(NOW/DAY,PublishDate),3.16e-11,1,1)^2+if(termfreq(ADSKFeature,'CUI (Command)'),log(CaseCount),sqrt(CaseCount))=10 -- Sent from:
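As a side note on the recip(ms(NOW/DAY,PublishDate),3.16e-11,1,1) part of the query above: Solr's recip(x,m,a,b) evaluates to a/(m*x+b), so with these constants the value is 1 for a document published today and about 0.5 after one year. A small worked sketch follows; the helper names are mine, and the ^2 weight from the query is not applied here.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def recip(x: float, m: float, a: float, b: float) -> float:
    """Solr's recip(x, m, a, b) function: a / (m * x + b)."""
    return a / (m * x + b)


def recency_boost(age_days: float) -> float:
    """Value of recip(ms(NOW/DAY,PublishDate),3.16e-11,1,1) for a document of the given age."""
    return recip(age_days * MS_PER_DAY, 3.16e-11, 1, 1)


# Print the decay curve for a few document ages.
for days in (0, 30, 365, 3650):
    print(f"{days:>5} days old -> {recency_boost(days):.3f}")
```

The constant 3.16e-11 is roughly 1 divided by the number of milliseconds in a year, which is why the value halves at the one-year mark.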

Re: Problem with white space or special characters in function queries

2019-03-26 Thread shamik
Edwin, The field is a string type, here's the field definition. -Shamik

Problem with white space or special characters in function queries

2019-03-25 Thread Shamik Bandopadhyay
iated. Thanks, Shamik

Re: Solr recovery issue in 7.5

2018-12-17 Thread shamik
I'm still pretty clueless trying to find the root cause of this behavior. One thing is pretty consistent: whenever a node restarts and sends a recovery command, the recipient shard/replica goes down due to a sudden surge in old gen heap space. Within minutes, it hits the ceiling and stalls the

Re: Solr recovery issue in 7.5

2018-12-14 Thread shamik
Thanks Eric. I guess I was not clear when I mentioned that I had stopped the indexing process. It was just a temporary step to make sure that we are not adding any new data when the nodes are in a recovery mode. The 10 minute hard commit is carried over from our 6.5 configuration which actually

Re: Solr recovery issue in 7.5

2018-12-12 Thread shamik
Erick, Thanks for your input. All our fields (for facet, group & sort) have docvalues enabled since 6.5. That includes the id field. Here's the field cache entry: CACHE.core.fieldCache.entries_count:0 CACHE.core.fieldCache.total_size: 0 bytes Based on whatever I've seen so far,

Solr recovery issue in 7.5

2018-12-12 Thread Shamik Bandopadhyay
t cache has gone up (0.61). It used to be 0.9 and 0.3 in Solr 6.5. Not sure what we are missing here in terms of the Solr upgrade to 7.5. I can provide other relevant information. Thanks, Shamik

Re: Does ConcurrentUpdateSolrClient apply for SolrCloud ?

2018-10-24 Thread shamik
Thanks Erick, appreciate your help

Re: Does ConcurrentUpdateSolrClient apply for SolrCloud ?

2018-10-24 Thread shamik
Thanks Erick, that's extremely insightful. I'm not using batching and that's the reason I was exploring ConcurrentUpdateSolrClient. Currently, N threads are reusing the same CloudSolrClient to send data to Solr. Of course, the single point of failure was my biggest concern with

Does ConcurrentUpdateSolrClient apply for SolrCloud ?

2018-10-24 Thread Shamik Bandopadhyay
Hi, I'm looking into the possibility of using ConcurrentUpdateSolrClient for indexing a large volume of data instead of CloudSolrClient. Having an async, batch API seems to be a better fit for us where we tend to index a lot of data periodically. As I'm looking into the API, I'm wondering if

Re: Multiple Queries per request

2018-10-02 Thread Shamik Sinha
Solr uses REST-based calls made over HTTP or HTTPS, which cannot handle multiple requests in one shot. However, what you can do is return all the necessary data in one shot and group it according to your needs. Thanks and regards, Shamik On 02-Oct-2018 8:11 PM, "Greenhorn T

Re: Regarding pdf indexing issue

2018-07-11 Thread Shamik Sinha
You may try to use tesseract tool to check data extraction from pdf or images and then go forward accordingly. As far as I understand the PDF is an image and not data. The searchable PDF actually overlays the selectable text as hidden text over the PDF image. These PDFs can be indexed and

Error using multiple terms in function query

2018-05-15 Thread Shamik Bandopadhyay
Hi, I'm having issues using multiple terms in Solr function queries. For e.g. I'm trying to use the following bf function using termfreq bf=if(termfreq(ProductLine,'Test Product'),5,0) This throws org.apache.solr.search.SyntaxError: Missing end to unquoted value starting at 28

Re: Text in images are not extracted and indexed to content

2018-04-10 Thread Shamik Sinha
To index text in images, the image needs to be searchable, i.e. text needs to be overlaid on the image like a searchable PDF. You can do this using OCR, but it is a bit unreliable if the images are scanned copies of written text. On 10-Apr-2018 4:12 PM, "Rahul Singh"

Re: Error when indexing with SolrJ HTTP ERROR 405

2018-03-19 Thread Shamik Sinha
tools. Check the url for the same. Then based on your requirement decide whether to use dih or oob indexing Thanks and regards, Shamik On Mon 19 Mar, 2018, 1:02 PM Khalid Moustapha Askia, < m.askiakha...@gmail.com> wrote: > Hi. I am trying to index some data with Solr by using SolrJ. B

SolrCloud collection design considerations / best practice

2017-11-13 Thread Shamik Bandopadhyay
xceed more than 500k documents. Any pointers will be appreciated. Thanks, Shamik

Re: Solr nodes going into recovery mode and eventually failing

2017-10-23 Thread shamik
Thanks Emir and Zisis. I added the maxRamMB for filterCache and reduced the size. I could see the benefit immediately, the hit ratio went to 0.97. Here's the configuration: It seemed to be stable for a few days, the cache hits and JVM pool utilization seemed to be well within the expected range. But

Re: Solr nodes going into recovery mode and eventually failing

2017-10-20 Thread shamik
Zisis, thanks for chiming in. This is really interesting information and probably in line with what I'm trying to fix. In my case, the facet fields are certainly not high-cardinality ones. Most of them have a finite set of data, the max being 200 (though it has a low usage percentage). Earlier I had

Re: Solr nodes going into recovery mode and eventually failing

2017-10-20 Thread shamik
Thanks Eric, in my case, each replica is running on its own JVM, so even if we consider 8gb of filter cache, it still has 27gb to play with. Isn't this a decent amount of memory to handle the rest of the JVM operation? Here's an example of implicit filters that get applied to almost all the

Re: Solr nodes going into recovery mode and eventually failing

2017-10-19 Thread shamik
Thanks Emir. The index is equally split between the two shards, each having approx 35gb. The total number of documents is around 11 million which should be distributed equally among the two shards. So, each core should take 3gb of the heap for a full cache. Not sure I get the "multiply it by
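The "3gb per core" figure above can be reproduced with a back-of-the-envelope calculation. In the worst case a filterCache entry is a bitset with one bit per document in the core, so it costs about maxDoc/8 bytes. The sketch below assumes a filterCache size of 4096 entries, which is not stated in this message and is used purely for illustration; maxDoc also counts deleted documents, so real numbers may be somewhat higher.

```python
import math


def filter_cache_worst_case_bytes(max_doc: int, entries: int) -> int:
    """Worst-case heap for a full filterCache: one bitset of max_doc bits per cached filter."""
    bytes_per_entry = math.ceil(max_doc / 8)
    return bytes_per_entry * entries


docs_per_core = 11_000_000 // 2   # 11 million documents split across two shards
assumed_entries = 4096            # assumption: the configured filterCache size

total = filter_cache_worst_case_bytes(docs_per_core, assumed_entries)
print(f"{total / 1e9:.2f} GB per core for a full cache")
```

With these inputs the result is a little under 3 GB per core, which lines up with the estimate in the message.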

Solr nodes going into recovery mode and eventually failing

2017-10-18 Thread Shamik Bandopadhyay
rt the instance, it goes into recovery mode and updates its index with the delta, which is understandable. But at the same time, the other replica in the same shard stalls and goes offline. This starts a cascading effect and I end up having to restart all the nodes. Any pointers will be appreciated. Thanks, Shamik

Authentication error : request has come without principal. failed permission

2017-10-02 Thread Shamik Bandopadhyay
Hi, I'm seeing this random Authentication failure in our Solr Cloud cluster which is eventually rendering the nodes in "down" state. This doesn't seem to have a pattern, just starts to happen out of the blue. I've 2 shards, each having two replicas. They are using Solr basic authentication

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-22 Thread shamik
Susheel, my inference was based on the QTime value from the Solr log and not on the application log. Before the CPU spike, the query time didn't give any indication that queries were slow or in the process of slowing down. As the GC suddenly triggers a high CPU usage, query execution slows down or chokes,

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-22 Thread shamik
I usually log queries that took more than 1sec. Based on the logs, I haven't seen anything alarming or surge in terms of slow queries, especially around the time when the CPU spike happened. I don't necessarily have the data for deep paging, but the usage of sort parameter (date in our case) has

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-22 Thread shamik
All the tuning and scaling down of memory seemed to be stable for a couple of days but then came down due to a huge spike in CPU usage, contributed by G1 Old Generation GC. I'm really puzzled why the instances are suddenly behaving like this. It's not that a sudden surge of load contributed to

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread shamik
Emir, after digging deeper into the logs (using new relic/solr admin) during the outage, it looks like a combination of query load and indexing process triggered it. Based on the earlier pattern, memory would tend to increase at a steady pace, but then surge all of a sudden, triggering OOM. After

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread shamik
Thanks, the change seemed to have addressed the memory issue (so far), but on the other hand, the GC choked the CPUs, stalling everything. The CPU utilization across the cluster clocked close to 400%, literally stalling everything. On a first look, the G1-Old generation looks to be the culprit that

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread shamik
I agree, should have made it clear in my initial post. The reason I thought it was trivial is that the newly introduced collection has only a few hundred documents and is not being used in search yet. Nor is it being indexed at a regular interval. The cache parameters are kept to a minimum as

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread shamik
Walter, thanks again. Here's some information on the index and search feature. The index size is close to 25gb, with 20 million documents. It has two collections, one being introduced with the 6.6 upgrade. The primary collection carries the bulk of the index, the newly formed one being aimed at getting

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread shamik
Thanks for your suggestion, I'm going to tune it and bring it down. It just happened to carry over from the 5.5 settings. Based on Walter's suggestion, I'm going to reduce the heap size and see if it addresses the problem.

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread shamik
Apologies, 290gb was a typo on my end, it should read 29gb instead. I started with my 5.5 configurations of limiting the RAM to 15gb. But it started going down once it reached the 15gb ceiling. I tried bumping it up to 29gb since memory seemed to stabilize at 22gb after running for few hours, of

Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Shamik Bandopadhyay
. Does 6.6 command more memory than what is currently available on our servers (30gb)? What might be the probable cause for this sort of scenario? What are the best practices to troubleshoot such issues? Any pointers will be appreciated. Thanks, Shamik

Help with Query/Function for conditional boost

2017-08-16 Thread Shamik Bandopadhyay
for reference to show what I'm trying to achieve. Any pointers will be helpful. Thanks, Shamik

Re: Issues trying to boost phrase containing stop word

2017-07-20 Thread shamik
Any suggestion? -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-trying-to-boost-phrase-containing-stop-word-tp4346860p4347068.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Issues trying to boost phrase containing stop word

2017-07-19 Thread shamik
Hi Koji, I'm using a copy field to preserve the original term with stopword. It's mapped to titleExact. textExact definition:

Re: Issues trying to boost phrase containing stop word

2017-07-19 Thread shamik
Thanks Koji, I've tried KeywordRepeatFilterFactory which keeps the original term, but the Stopword filter in the analysis chain will remove it nonetheless. That's why I thought of creating a separate field devoid of stopwords/stemmers. Let me know if I'm missing something here. -- View this

Issues trying to boost phrase containing stop word

2017-07-19 Thread Shamik Bandopadhyay
reciate it if someone can provide pointers. If there's a different approach to solving this issue, please let me know. Thanks, Shamik

Re: How to combine third party search data as top results ?

2017-02-06 Thread shamik
Charlie, this looks very close to what I'm looking for. Just wondering if you've made this available as a jar or if it can be built from source? Our Solr distribution is not built from source, I can only use an external jar. I'd appreciate it if you can let me know. -- View this message in

Re: How to combine third party search data as top results ?

2017-02-01 Thread shamik
Charlie, thanks for sharing the information. I'm going to take a look and get back to you. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-combine-third-party-search-data-as-top-results-tp4318116p4318349.html Sent from the Solr - User mailing list archive at

Re: How to combine third party search data as top results ?

2017-01-31 Thread shamik
Thanks, John. The title is not unique, so I can't really rely on it. Also, keeping an external mapping for url and id might not be feasible as we are talking about possibly millions of documents. URLs are unique in our case; unfortunately, they can't be used as part of Query elevation component since

How to combine third party search data as top results ?

2017-01-31 Thread Shamik Bandopadhyay
a way to combine step 2 and 3 in a single query or a different approach altogether? Any pointers will be appreciated. -Thanks, Shamik

Re: Information on classifier based key word suggestion

2017-01-23 Thread shamik
Anyone ? -- View this message in context: http://lucene.472066.n3.nabble.com/Information-on-classifier-based-key-word-suggestion-tp4314942p4315492.html

Information on classifier based key word suggestion

2017-01-19 Thread Shamik Bandopadhyay
to be limited only providing taxonomy data which needs to be provided as flat text. A few people suggested using classifiers like a Naive Bayes classifier or other machine learning tools. I'd appreciate it if anyone can provide some direction in this regard. Thanks, Shamik

Re: How to support facet values in search term

2016-11-22 Thread shamik
Thanks for the pointer Alex . I'll go through all four articles, thanksgiving will be fun :-) -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-include-facet-fields-in-keyword-search-tp4306967p4307020.html

How to support facet values in search term

2016-11-22 Thread Shamik Bandopadhyay
xt, title, keyword, etc), it's not returning any data. We've a large set of facet fields, I would ideally like to avoid adding them as part of the searchable list. Just wondering if there's a better way to handle this situation. Any pointers will be appreciated. Thanks, Shamik

Re: SolrJ doesn't work with Json facet api

2016-10-05 Thread shamik
You can try something like : query.add("json.facet", your_json_facet_query); -- View this message in context: http://lucene.472066.n3.nabble.com/SolrJ-doesn-t-work-with-Json-facet-api-tp4299867p4299888.html
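The same idea, passing the facet definition as a serialized JSON string under the json.facet request parameter, can be sketched outside SolrJ as well. The facet below (a terms facet on the Source2 field, labelled "sources") is only an illustrative assumption, and no request is actually sent.

```python
import json
from urllib.parse import urlencode


def json_facet_params(query: str, facet: dict) -> dict:
    """Return request parameters with the facet definition serialized into json.facet."""
    return {"q": query, "rows": 0, "json.facet": json.dumps(facet)}


# Illustrative facet definition: top 5 values of the Source2 field.
facet = {"sources": {"type": "terms", "field": "Source2", "limit": 5}}
params = json_facet_params("*:*", facet)

# The encoded string is what would be appended to the request handler URL.
print(urlencode(params))
```

In SolrJ the value of your_json_facet_query would be the same serialized string that json.dumps produces here.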

RE: how to remove duplicate from search result

2016-09-27 Thread shamik
Did you take a look at the Collapsing Query Parser? https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-remove-duplicate-from-search-result-tp4298272p4298305.html

Re: How to retrieve parent documents without a nested structure (block-join)

2016-09-27 Thread shamik
Thanks again Alex. I should have clarified the use of the browse request handler. The reason is that I'm simulating the request handler parameters of my production system using browse. I used a separate request handler, stripped down all properties to match "select". I finally narrowed down the issue to

Re: How to retrieve parent documents without a nested structure (block-join)

2016-09-27 Thread shamik
Sorry to bump this up, but can someone please explain the parsing behaviour of a join query (shown above) with respect to different request handlers? -- View this message in context:

Re: How to retrieve parent documents without a nested structure (block-join)

2016-09-26 Thread shamik
Thanks Alex, this has been extremely helpful. There's one doubt though. The query returns expected result if I use "select" or "query" request handler, but fails for others. Here's the debug output from "/select" using edismax.

Re: How to retrieve parent documents without a nested structure (block-join)

2016-09-25 Thread shamik
Thanks for getting back on this. I was trying to formulate a query along similar lines but have not been able to construct it (multiple clauses) correctly so far. That can be attributed to my inexperience with Solr queries as well. Can you please point to any documentation / example for my reference ?

Re: How to retrieve parent documents without a nested structure (block-join)

2016-09-25 Thread shamik
Thanks Alex. With the conventional join query I'm able to return the parent document based on a query match on the child. But, it filters out any other documents which are outside the scope of join condition. For e.g. in my case, I would expect the query to return : 1 Parent title

How to retrieve parent documents without a nested structure (block-join)

2016-09-22 Thread Shamik Bandopadhyay
appreciated. Thanks, Shamik

Re: Inventor-template vs Inventor template - issue with hyphen

2016-08-26 Thread shamik
Anyone ? -- View this message in context: http://lucene.472066.n3.nabble.com/Inventor-template-vs-Inventor-template-issue-with-hyphen-tp4293357p4293489.html

Re: Inventor-template vs Inventor template - issue with hyphen

2016-08-25 Thread shamik
icles^9.0 Source2:downloads^5.0 1.0/(3.16E-11*float(ms(const(147216960),date(PublishDate)))+1.0) The part I'm confused is why the two queries are being interpreted differently ? Thanks, Shamik -- View this message in context: http://lucene.472066.n3.nabble.com/Inventor-template-vs-Inven

Inventor-template vs Inventor template - issue with hyphen

2016-08-25 Thread Shamik Bandopadhyay
642158 text:inventor 0.098686054 Source2:CloudHelp 0.009136423 1.0/(3.16E-11*float(ms(const(147208320),date(PublishDate)))+1.0) I'm using edismax. Just wondering what I'm missing here. Any help will be appreciated. Regards, Shamik

Re: [ANN] Relevant Search by Manning out! (Thanks Solr community!)

2016-06-23 Thread shamik
Thanks for all the pointers. With the 50% discount, picking up a copy is a no-brainer -- View this message in context: http://lucene.472066.n3.nabble.com/ANN-Relevant-Search-by-Manning-out-Thanks-Solr-community-tp4283667p4284107.html

Re: [ANN] Relevant Search by Manning out! (Thanks Solr community!)

2016-06-23 Thread shamik
Hi Doug, Congratulations on the release, I guess a lot of us have been eagerly waiting for this. Just one quick clarification. You mentioned that the examples in your book are executed against Elasticsearch. For someone familiar with Solr, will it be an issue to run those examples in a Solr

Re: Multiple context field / filters in Solr suggester

2016-06-22 Thread shamik
Anyone ? -- View this message in context: http://lucene.472066.n3.nabble.com/Multiple-context-field-filters-in-Solr-suggester-tp4283739p4283894.html

Multiple context field / filters in Solr suggester

2016-06-21 Thread Shamik Bandopadhyay
y pointers will be appreciated. Thanks, Shamik

Re: Solrj Basic Authentication randomly failing - "request has come without principal"

2016-05-18 Thread shamik
anyone ? -- View this message in context: http://lucene.472066.n3.nabble.com/Solrj-Basic-Authentication-randomly-failing-request-has-come-without-principal-tp4277342p4277533.html

Solrj Basic Authentication randomly failing - request has come without principal

2016-05-17 Thread Shamik Bandopadhyay
Hi, I'm facing this issue where SolrJ calls are randomly failing on basic authentication. Here's exception: ERROR923629[qtp466002798-20] - org.apache.solr.security.PKIAuthenticationPlugin.doAuthenticate(PKIAuthenticationPlugin.java:125) - Invalid key INFO923630[qtp466002798-20] -

set-property API doesn't work for security.json authentication

2016-05-12 Thread Shamik Bandopadhyay
Hi, I'm trying to update the set-property option in security.json authentication section. As per the documentation, "Set arbitrary properties for authentication plugin. The only supported property is 'blockUnknown'" https://cwiki.apache.org/confluence/display/solr/Basic+Authentication+Plugin

Re: Solrj API with Basic Authentication

2016-05-11 Thread shamik
Ok, I found another way of doing it which will preserve the QueryResponse object. I've used DefaultHttpClient, set the credentials and finally passed it as a constructor to the CloudSolrClient. *DefaultHttpClient httpclient = new DefaultHttpClient(); UsernamePasswordCredentials defaultcreds = new

Solrj API with Basic Authentication

2016-05-11 Thread Shamik Bandopadhyay
onse or UpdateResponse objects instead. Any pointers will be appreciated. -Thanks, Shamik

Re: Issues with Authentication / Role based authorization

2016-05-11 Thread shamik
Brian, Thanks for your reply. My first post was a bit convoluted; I tried to explain the issue in the subsequent post. Here's a security JSON. I've solr and beehive assigned the admin role which allows them to have access to "update" and "read". This works as expected. I add a new role "browseRole"

Re: Issues with Authentication / Role based authorization

2016-05-11 Thread shamik
Anyone ? -- View this message in context: http://lucene.472066.n3.nabble.com/Issues-with-Authentication-Role-based-authorization-tp4276024p4276153.html

Re: Issues with Authentication / Role based authorization

2016-05-11 Thread shamik
"v": 2 } } } And authorization: { "responseHeader": { "status": 0, "QTime": 0 }, "authorization.enabled": true, "authorization": { "class": "solr.RuleBasedAuthorizationPlugin", "user-role": { "solr": "admin", "superuser": [ "browseRole", "selectRole" ], "beehive": [ "browseRole", "selectRole" ] }, "permissions": [ { "name": "security-edit", "role": "admin" }, { "name": "select", "collection": "gettingstarted", "path": "/select/*", "role": "selectRole" }, { "name": "browse", "collection": "gettingstarted", "path": "/browse", "role": "browseRole" } ], "": { "v": 7 } } } I was under the impression that these roles are independent of each other, based on the assignment, individual user should be able to access their respective areas. On a related note, I was not able to make roles like "all", "read" work. Not sure what I'm doing wrong here. Any feedback will be appreciated. Thanks, Shamik -- View this message in context: http://lucene.472066.n3.nabble.com/Issues-with-Authentication-Role-based-authorization-tp4276024p4276056.html Sent from the Solr - User mailing list archive at Nabble.com.

How do we generate SHA256 password for Authentication

2016-05-10 Thread Shamik Bandopadhyay
mode "solr start -e cloud -noprompt" 2. zkcli.bat -zkhost localhost:9983 -cmd putfile /security.json security.json 3. Tried http://localhost:8983/solr/gettingstarted/browse, provided dev/password but I'm getting the following exception: [c:gettingstarted s:shard2 r:core_node3 x:gettingstarted_shard2_replica2] org.apache.solr.servlet.HttpSolrCall; USER_REQUIRED auth header Basic c29scjpTb2xyUm9ja3M= context : userPrincipal: [[principal: solr]] type: [UNKNOWN], collections: [gettingstarted,], Path: [/browse] path : /browse params : Looks like I'm using the wrong way of generating the password. solr/SolrRocks works as expected. Also, not sure what's wrong with the "readRole". It doesn't seem to work when I try with user "solr". Any pointers will be appreciated. -Thanks, Shamik
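Two things in the message above can be checked with a few lines of code. First, the Basic auth header in the log can be decoded to see which credentials the request actually carried. Second, a credential string for security.json can be generated. The hashing scheme below (SHA-256 of salt plus password, hashed once more, then stored as base64 hash, a space, and base64 salt) reflects my understanding of Solr's Sha256AuthenticationProvider and should be treated as an assumption to verify against your Solr version, not as the definitive implementation. The function names are mine.

```python
import base64
import hashlib
import os
from typing import Optional, Tuple


def decode_basic_auth(header_value: str) -> Tuple[str, str]:
    """Split a 'Basic <token>' header value into (user, password)."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic auth header")
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


def solr_credential(password: str, salt: Optional[bytes] = None) -> str:
    """Assumed scheme: base64(sha256(sha256(salt + password))) + ' ' + base64(salt)."""
    if salt is None:
        salt = os.urandom(32)  # random salt, generated per credential
    first = hashlib.sha256(salt + password.encode("utf-8")).digest()
    second = hashlib.sha256(first).digest()
    return base64.b64encode(second).decode("ascii") + " " + base64.b64encode(salt).decode("ascii")


# The header from the log line decodes to the solr user, not dev.
print(decode_basic_auth("Basic c29scjpTb2xyUm9ja3M="))
# A fresh credential value for the hypothetical user "dev".
print(solr_credential("password"))
```

Decoding the logged header shows that the failing request was sent with the solr/SolrRocks credentials rather than dev/password, which may help narrow down whether the problem lies in password generation or in the role assignment.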

Return only parent on child query match (w/o block-join)

2016-04-19 Thread Shamik Bandopadhyay
the reverse support through ChildDocTransformerFactory Just wondering if there's a way to address query in a different way. Any pointers will be appreciated. -Thanks, Shamik

Re: MLT Query Parser

2016-04-07 Thread shamik
Thanks Shawn and Alessandro. I get the part why id is needed. I was trying to compare with the "mlt" request handler which doesn't enforce such constraint. My previous example of title/keyword is not the right one, but I do have fields which are unique to each document and can be used as a key to

Re: MLT Query Parser

2016-04-06 Thread shamik
Thanks Alessandro, that answers my doubt. In a nutshell, to make the MLT Query parser work, you need to know the document id. I'm just curious as to why this constraint has been added. This will not work for a bulk of use cases. For example, if we are trying to generate MLT based on a text or a keyword, how

MLT Query Parser

2016-04-05 Thread Shamik Bandopadhyay
mlt documents based on a "keyword" field. With the new query parser, I'm not able to see a way to use another field except for id. Is this a constraint? Or is there a different syntax? Any pointers will be appreciated. Thanks, Shamik

Solr 5.5 error at startup - ClassNotFoundException: org.simpleframework.xml.core.Persister

2016-03-19 Thread Shamik Bandopadhyay
y pointers will be appreciated. -Thanks, Shamik

Error starting solr 5.5 - Cannot open solr.log:No such file or directory

2016-03-19 Thread Shamik Bandopadhyay
preferIPv4Stack=true -Dlog4j.configuration=file:/mnt/ebs2/solrhome/log4j.properties -Dsolr.autoCommit.maxTime=6 -Dsolr.clustering.enabled=true Not sure what's going wrong. Any pointers will be appreciated. -Thanks, Shamik

Re: Solr Cloud sharding strategy

2016-03-07 Thread shamik
Thanks Eric and Walter, this is extremely insightful. One last followup question on composite routing. I'm trying to have a better understanding of index distribution. If I use language as a prefix, SolrCloud guarantees that same language content will be routed to the same shard. What I'm curious

Re: Solr Cloud sharding strategy

2016-03-07 Thread shamik
Thanks a lot, Erick. You are right, it's a tad small with around 20 million documents, but the growth projection is around 50 million in the next 6-8 months. It'll continue to grow, but maybe not at the same rate. From the index size point of view, the size can grow up to half a TB from its current

Solr Cloud sharding strategy

2016-03-07 Thread Shamik Bandopadhyay
collection for English and one for rest of the languages. Any pointers on this will be highly appreciated. Regards, Shamik

Re: understand scoring

2016-03-01 Thread shamik
Doug, do we have a date for the hard copy launch? -- View this message in context: http://lucene.472066.n3.nabble.com/understand-scoring-tp4260837p4260860.html

Re: docValues error

2016-02-28 Thread shamik
David, this is a tad weird. I've seen this error if you turn on docvalues for an existing field. You can try running an "optimize" on your index and see if it helps. -- View this message in context: http://lucene.472066.n3.nabble.com/docValues-error-tp4260408p4260455.html

Re: Query time de-boost

2016-02-28 Thread shamik
I tried the function query route, but getting a weird exception. *bf=if(termfreq(ContentGroup,'Developer Doc'),-20,0)* throws an exception *org.apache.solr.search.SyntaxError: Missing end quote for string at pos 29 str='if(termfreq(ContentGroup,'Developer'* . Does it only accept single word or

Re: Query time de-boost

2016-02-26 Thread shamik
Thanks Walter, I've tried this earlier and it works. But the problem in my case is that I'm boosting on a few Source parameters as well. My ideal "bq" should look like this: *bq=Source:simplecontent^10 Source:Help^20 (*:* -ContentGroup-local:("Developer"))^99* But this is not going to work. I'm
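One option that may be worth trying here is to send each boost as its own bq parameter instead of packing everything into one, since the dismax/edismax parsers accept bq more than once. The sketch below only builds the query string; the search term is a placeholder, the qf value is taken from an earlier message in this thread, and whether this resolves the scoring issue in this particular setup is not confirmed by the thread.

```python
from urllib.parse import urlencode

# A list of pairs (rather than a dict) lets the same parameter name repeat.
params = [
    ("defType", "edismax"),
    ("q", "graphics"),  # placeholder search term
    ("qf", "text^6 title^15 IndexTerm^8"),
    ("bq", "Source:simplecontent^10"),
    ("bq", "Source:Help^20"),
    # Boost everything that is NOT tagged Developer, which pushes Developer docs down.
    ("bq", '(*:* -ContentGroup-local:("Developer"))^99'),
]

query_string = urlencode(params)
print(query_string)
```

Keeping the "everything except Developer" clause in a separate bq avoids mixing it with the positive Source boosts in a single boolean query.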

Re: Query time de-boost

2016-02-25 Thread shamik
Emir, I don't think Solr supports a negative boosting *^-99* syntax like this. I can certainly do something like: bq=(*:* -ContetGroup:"Developer's Documentation")^99 , but then I can't have my other bq parameters. This doesn't work --> bq=Source:simplecontent^10 Source:Help^20 (*:*

Re: Query time de-boost

2016-02-24 Thread shamik
Binoy, 0.1 is still a positive boost. With title getting the highest weight, this won't make any difference. I've tried this as well. -- View this message in context: http://lucene.472066.n3.nabble.com/Query-time-de-boost-tp4259309p4259552.html Sent from the Solr - User mailing list archive at

Re: Query time de-boost

2016-02-24 Thread shamik
Hi Emir, I've a bunch of contentgroup values, so boosting them individually is cumbersome. I'm boosting on query fields qf=text^6 title^15 IndexTerm^8 and bq=Source:simplecontent^10 Source:Help^20 (-ContentGroup-local:("Developer"))^99 I was hoping

Query time de-boost

2016-02-23 Thread Shamik Bandopadhyay
es.graphics". The boost on title pushes these documents to the top. What I'm looking for is a way to deboost all documents that are tagged with ContentGroup:"Developer" irrespective of whether the term occurs in text or title. Any pointers will be appreciated. Thanks, Shamik

Re: Question on index time de-duplication

2015-11-01 Thread shamik
That's what I observed as well. Perhaps there's a way to customize SignatureUpdateProcessorFactory to support my use case. I'll look into the source code and figure if there's a way to do it. -- View this message in context:

RE: Question on index time de-duplication

2015-10-30 Thread shamik
Thanks Markus. I've been using field collapsing till now but the performance constraint is forcing me to think about index time de-duplication. I've been using a composite router to make sure that duplicate documents are routed to the same shard. Won't that work for SignatureUpdateProcessorFactory

Re: Question on index time de-duplication

2015-10-30 Thread shamik
Thanks Scott. I could directly use field collapsing on the adskdedup field without the signature field. The problem with field collapsing is the performance overhead. It slows the query down tenfold. CollapsingQParserPlugin is a better option, unfortunately, it doesn't support ngroups equivalent,
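For reference, the CollapsingQParserPlugin mentioned above is applied as a filter query using local-params syntax. A minimal sketch of building that filter for the adskdedup field follows; the helper name is mine, and nullPolicy is included only as an optional illustration (accepted values are ignore, expand and collapse).

```python
from typing import Optional
from urllib.parse import urlencode


def collapse_filter(field: str, null_policy: Optional[str] = None) -> str:
    """Build a {!collapse ...} filter query for the given field."""
    parts = [f"field={field}"]
    if null_policy is not None:
        parts.append(f"nullPolicy={null_policy}")
    return "{!collapse " + " ".join(parts) + "}"


# Collapse duplicates on adskdedup; the query itself is a placeholder.
params = [("q", "*:*"), ("fq", collapse_filter("adskdedup"))]
print(urlencode(params))
```

The collapse filter keeps one representative document per adskdedup value in the result set.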

Re: Question on index time de-duplication

2015-10-30 Thread shamik
Thanks for your reply. Have you customized SignatureUpdateProcessorFactory or are you using the configuration out of the box ? I know it works for simple dedup, but my requirement is a tad different as I need to tag an identifier to the latest document. My goal is to understand if that's possible

Question on index time de-duplication

2015-10-29 Thread Shamik Bandopadhyay
wondering if this is achievable by perhaps extending UpdateRequestProcessorFactory or customizing SignatureUpdateProcessorFactory ? Any pointers will be appreciated. Regards, Shamik

Re: Issue Using Solr 5.3 Authentication and Authorization Plugins

2015-09-01 Thread shamik
Hi Kevin, Were you able to get a workaround / fix for your problem ? I'm also looking to secure Collection and Update APIs by upgrading to 5.3. Just wondering if it's worth the upgrade or should I wait for the next version, which will probably address this. Regards, Shamik -- View
