Re: Starting Solr automatically

2019-12-16 Thread Paras Lehana
Hi Anuj,

Firstly, you should check the logs for the reason Solr is getting
stopped. We started our Solr instance over a year ago and it's still up. I
suspect an OOM kill in your case.

Secondly, there are many ways to restart Solr. For example, if it's
registered as a service, you can add a cron job that restarts Solr
whenever it's not running, as in the sketch below.
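
As a minimal sketch, assuming the init script from the thread ('service
solr start') and that 'service solr status' exits non-zero when Solr is
down (worth verifying on your install), a crontab entry could look like:

    # Every 5 minutes: start Solr if the status check reports it down
    */5 * * * * service solr status >/dev/null 2>&1 || service solr start

On systemd hosts, Restart=on-failure in the unit file achieves the same
thing without cron's polling delay.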

But as I said before, please look for the reason Solr is getting
stopped in the first place.

On Tue, 17 Dec 2019 at 10:18, Anuj Bhargava  wrote:

> Often Solr stops working. We then have to go to the root directory and give
> the command 'service solr start'.
>
> Is there a way to automatically start Solr when it stops?
>
> Regards,
> Anuj
>


-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*



Re: updating documents via csv

2019-12-16 Thread Paras Lehana
Hi Rhys,

I use CDATA for XML updates so that special characters inside the value
don't need escaping, something like:

   <add>
     <doc>
       <field name="name1"><![CDATA[NORTH AMERICAN INT'L]]></field>
     </doc>
   </add>

There should be a similar solution for JSON, though I couldn't find a
specific one on the internet. If you are okay with using XML for
indexing, you can use this.
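
As a side note, what breaks in the quoted curl commands below is shell
quoting rather than JSON: a single quote can't appear inside a
single-quoted bash string. A minimal sketch of the standard '\''
workaround (close the quote, insert an escaped literal quote, reopen),
reusing the core and field names from the thread:

    curl http://localhost:8983/solr/dbtr/update?commit=true \
      -H 'Content-Type: application/json' \
      -d '[{"id": "356767", "name1": {"set": "NORTH AMERICAN INT'\''L"}, "name2": {"set": " "}}]'

Reading the body from a file with -d @update.json avoids shell quoting
altogether.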

On Tue, 17 Dec 2019 at 01:40, rhys J  wrote:

> Is there a way to update documents already stored in the Solr cores via
> CSV?
>
> I am asking because I am running into a problem with updating
> via script when single quotes are embedded in the field value itself.
>
> Example:
>
> curl http://localhost:8983/solr/dbtr/update?commit=true -d '[{ "id":
> "356767", "name1": {"set": "NORTH AMERICAN INT'L"},"name2": {"set": " "}}]'
>
> I have tried the following as well:
>
> curl http://localhost:8983/solr/dbtr/update?commit=true -d '[{ "id":
> "356767", "name1": {"set": "NORTH AMERICAN INT\'L"},"name2": {"set": "
> "}}]'
>
> curl http://localhost:8983/solr/dbtr/update?commit=true -d '[{ "id":
> "356767", "name1": {"set": "NORTH AMERICAN INT\\'L"},"name2": {"set": "
> "}}]'
>
> curl http://localhost:8983/solr/dbtr/update?commit=true -d '[{ \\"id\\":
> \\"356767\\", \\"name1\\": {\\"set\\": \\"NORTH AMERICAN INT\\'L\\"},}]'
>
> All of these break on the single quote embedded in field name1.
>
> Does anyone have any ideas as to what I can do to get around this?
>
> I will also eventually need to get around having an & inside a field too,
> but that hasn't come up yet.
>
> Thanks,
>
> Rhys
>


-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*



Starting Solr automatically

2019-12-16 Thread Anuj Bhargava
Often Solr stops working. We then have to go to the root directory and give
the command 'service solr start'.

Is there a way to automatically start Solr when it stops?

Regards,
Anuj



updating documents via csv

2019-12-16 Thread rhys J
Is there a way to update documents already stored in the Solr cores via CSV?

I am asking because I am running into a problem with updating
via script when single quotes are embedded in the field value itself.

Example:

curl http://localhost:8983/solr/dbtr/update?commit=true -d '[{ "id":
"356767", "name1": {"set": "NORTH AMERICAN INT'L"},"name2": {"set": " "}}]'

I have tried the following as well:

curl http://localhost:8983/solr/dbtr/update?commit=true -d '[{ "id":
"356767", "name1": {"set": "NORTH AMERICAN INT\'L"},"name2": {"set": " "}}]'

curl http://localhost:8983/solr/dbtr/update?commit=true -d '[{ "id":
"356767", "name1": {"set": "NORTH AMERICAN INT\\'L"},"name2": {"set": "
"}}]'

curl http://localhost:8983/solr/dbtr/update?commit=true -d '[{ \\"id\\":
\\"356767\\", \\"name1\\": {\\"set\\": \\"NORTH AMERICAN INT\\'L\\"},}]'

All of these break on the single quote embedded in field name1.

Does anyone have any ideas as to what I can do to get around this?

I will also eventually need to get around having an & inside a field too,
but that hasn't come up yet.

Thanks,

Rhys


Re: need for re-indexing when using managed schema

2019-12-16 Thread Erick Erickson
That’s a little overstated; a full explanation of what’s safe and what’s not
would run several pages and depends on what you mean by “safe”.

Any modification to a schema, even one that doesn’t cause something to
outright break, may leave the index in an inconsistent state. For instance,
remember that Lucene and Solr really don’t care if doc1 doesn’t have a
particular field X while doc2 does. If you do something as “safe” as adding a
new field, only documents indexed after that change will have the field. Your
index will continue to function with no errors in that case, but any search on
the new field won’t return docs indexed before the change until the older docs
are re-indexed.

So you can see where this is going. If you add a field _and then reindex all
your documents_, it’s perfectly safe. However, between the time you add the
field and the time re-indexing completes, your results may be inconsistent.

On the other hand, if you change, say, a docValues field from
multiValued=“true” to multiValued=“false”, the results are undefined _even if
you reindex all your docs_.

On the other, other hand, if you delete a field, its meta-data is still in
your index; the only way to get rid of it is to delete your index and
re-index, or to index into a new collection. Also, searches may still return
docs on the deleted field if it was created with a dynamic field definition
that’s still in the schema.

On the other, other, other hand… the list goes on and on.

Since even something as non-breaking as adding a new field requires you to
re-index all your older docs anyway to get back to a consistent state, it’s
easiest to just plan on re-indexing all your docs whenever you change the
schema. And, I’d also advise, index into a new collection…
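
For reference, a minimal sketch of the kind of Schema API change under
discussion (the collection and field names here are hypothetical):

    curl -X POST -H 'Content-Type: application/json' \
      http://localhost:8983/solr/mycollection/schema \
      -d '{"add-field": {"name": "new_field", "type": "string", "stored": true}}'

Per the above, follow any such change with a full re-index, ideally into
a new collection.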

Best,
Erick

> On Dec 16, 2019, at 12:57 PM, Joseph Lorenzini  wrote:
> 
> Hi all,
> 
> I have a question about the managed schema functionality. According to the
> docs, "All changes to a collection’s schema require reindexing". This would
> imply that if you use a managed schema and you use the schema API to update
> the schema, then doing a full re-index is necessary each time.
> 
> Is this accurate or can a full re-index be avoided?
> 
> Thanks,
> Joe



Re: backing up and restoring

2019-12-16 Thread rhys J
On Mon, Dec 16, 2019 at 1:42 AM Paras Lehana 
wrote:

> Looks like a write lock. Did reloading the core fix that? I guess it would
> have been fixed by now. I guess you had run the delete query a few moments
> after restoring, no?
>
>
Restoring with the name parameter set only worked once.

This is my workaround:

1. Run the backup command.
2. Delete documents.
3. Stop Solr.
4. Start Solr.
5. Delete the segment and write.lock files by name.
6. Copy over the index files from the snapshot to the data/index folder.
7. Start Solr.
8. Verify the presence of documents via a search for *:*.

I know it's not pretty, but I have found it works every time.
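
A rough shell sketch of those steps (core name, paths, and snapshot name
are assumptions, and Solr is simply kept stopped while the files are
swapped):

    CORE=dbtr
    DATA=/var/solr/data/$CORE/data

    # Steps 1-2: back up, then delete the documents
    curl "http://localhost:8983/solr/$CORE/replication?command=backup&name=snap1"
    curl "http://localhost:8983/solr/$CORE/update?commit=true" \
      -H 'Content-Type: application/json' -d '{"delete": {"query": "*:*"}}'

    # Steps 3-5: stop Solr and remove the old segments file and lock
    bin/solr stop
    rm -f "$DATA"/index/segments_* "$DATA"/index/write.lock

    # Steps 6-7: copy the snapshot contents back and start Solr
    cp "$DATA"/snapshot.snap1/* "$DATA"/index/
    bin/solr start

    # Step 8: verify documents are back
    curl "http://localhost:8983/solr/$CORE/select?q=*:*&rows=0"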

Thanks,

Rhys


Re: [EXTERNAL] Re: Autoscaling simulation error

2019-12-16 Thread Cao, Li
Hi Andrzej,

I have put the JSONs produced by "save" commands below:

autoscalingState.json - https://pastebin.com/CrR0TdLf
clusterState.json - https://pastebin.com/zxuYAMux
nodeState.json - https://pastebin.com/hxqjVUfV
statistics.json - https://pastebin.com/Jkaw8Y3j
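
These were produced with a save command along the lines of the following
(the output directory is an assumption):

    bin/solr autoscaling -zkHost rexcloud-swoods-zookeeper-headless:2181 \
      -save /tmp/autoscaling-snapshot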

The simulate command is:
/opt/solr-8.3.0/bin/solr autoscaling -a policy2.json -simulate  -zkHost 
rexcloud-swoods-zookeeper-headless:2181

Policy2 can be found here:
https://pastebin.com/VriJ27DE

Setup:
12 nodes on Kubernetes: 6 for TLOG replicas and 6 for PULL replicas. The
simulation is run on one of the nodes inside Kubernetes because it needs to
reach the ZooKeeper ensemble running inside the cluster.

Thanks!

Li


On 12/15/19, 5:13 PM, "Andrzej Białecki"  wrote:

Could you please provide the exact command-line? It would also help if you 
could provide an autoscaling snapshot of the cluster (bin/solr autoscaling 
-save ) or at least the autoscaling diagnostic info.

(Please note that the mailing list removes all attachments, so just provide 
a link to the snapshot).


> On 15 Dec 2019, at 18:42, Cao, Li  wrote:
>
> Hi!
>
> I am using solr 8.3.0 in cloud mode. I have collection level autoscaling 
policy and the collection name is “entity”. But when I run autoscaling 
simulation all the steps failed with this message:
>
>    "error":{
>      "exception":"java.io.IOException: java.util.concurrent.ExecutionException: org.apache.solr.common.SolrException: org.apache.solr.common.SolrException: Could not find collection : entity/shards",
>      "suggestion":{
>        "type":"repair",
>        "operation":{
>          "method":"POST",
>          "path":"/c/entity/shards",
>          "command":{"add-replica":{
>            "shard":"shard2",
>            "node":"my_node:8983_solr",
>            "type":"TLOG",
>            "replicaInfo":null}}},
>
> Does anyone know how to fix this? Is this a bug?
>
> Thanks!
>
> Li




Re: unable to update using empty strings or 'null' in value

2019-12-16 Thread rhys J
On Mon, Dec 16, 2019 at 2:51 AM Paras Lehana 
wrote:

> Hey Rhys,
>
>
> Short Answer: Try using "set": null and not "set": "null".
>
>
Thank you, this worked!

Rhys
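
For reference, a minimal sketch of the form that worked, with the core
name, id, and field borrowed from the earlier thread (assumptions, not
restated in the original):

    # JSON null removes the field; the string "null" would store literal text
    curl http://localhost:8983/solr/dbtr/update?commit=true \
      -H 'Content-Type: application/json' \
      -d '[{"id": "356767", "name1": {"set": null}}]'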


need for re-indexing when using managed schema

2019-12-16 Thread Joseph Lorenzini
Hi all,

I have a question about the managed schema functionality. According to the
docs, "All changes to a collection’s schema require reindexing". This would
imply that if you use a managed schema and you use the schema API to update
the schema, then doing a full re-index is necessary each time.

Is this accurate or can a full re-index be avoided?

Thanks,
Joe


SolrTextTagger with multiple fields

2019-12-16 Thread Atita Arora
Hi,


I went through SolrTextTagger in Solr; it sounds quite interesting, and I am
wondering what the implications of using multiple tag fields are.

The idea is to identify different types of fields in the user query and use
them as filters.
Can anyone direct me to some examples?
Can we include a comma-separated list in the field param of the
request handler?

Thank you ,
Atita