Re: Increase Physical Memory in Solr

2020-01-13 Thread Terry Steichen
Maybe solr isn't using enough of your available memory (a rough check is produced by 'solr status'). Do you realize you can start solr with a  '-m xx' parameter? (for me, xx = 1g) Terry On 1/13/20 3:12 PM, rhys J wrote: On Mon, Jan 13, 2020 at 3:11 PM Gael Jourdan-Weil <
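As a rough illustration of the '-m xx' option mentioned above, the following Python sketch builds a `bin/solr start` command with an explicit heap size. The helper name `solr_start_command`, the `solr_bin` default, and the accepted size pattern (digits followed by `m` or `g`) are assumptions made for illustration; they are not taken from the Solr scripts themselves.

```python
import re

# Assumption: heap sizes look like "512m" or "1g"; a value such as "1gb" is rejected.
HEAP_SIZE = re.compile(r"\d+[mg]")


def solr_start_command(heap, cloud=False, solr_bin="bin/solr"):
    """Return the argument list for starting Solr with a fixed heap size."""
    if not HEAP_SIZE.fullmatch(heap):
        raise ValueError(f"unexpected heap size {heap!r}; expected e.g. '1g' or '512m'")
    command = [solr_bin, "start"]
    if cloud:
        command.append("-c")  # start in cloud mode
    command += ["-m", heap]  # -m sets both the minimum and maximum heap
    return command


if __name__ == "__main__":
    print(" ".join(solr_start_command("1g")))
```

The resulting list can be handed to `subprocess.run`; afterwards, 'solr status' gives a rough view of how much of that heap is in use.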

Re: Newbie permissions problem running solr

2019-05-30 Thread Terry Steichen
For what it's worth - after not using it for some time, I just started up my solr system (6.6.0) and made a mistake in the command line.  I mistakenly used 'bin/solr start -c -m 1gb' and got precisely the same error message as Bernard did (other than the '.." part).  When I changed it to the

Re: Content from EML files indexing from text/html (which is not clean) instead of text/plain

2019-01-14 Thread Terry Steichen
Using 6.6.0, I am able to index EML files just fine.  The trick is, when indexing files containing .eml, add "-filetypes eml" to the commandline (note the plural filetypes). Terry Steichen On 1/13/19 10:18 PM, Zheng Lin Edwin Yeo wrote: > Hi, > > I am using Solr 7
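A minimal sketch of the post-tool invocation described above, written as a Python command builder. The collection name and directory in the usage lines are hypothetical, as are the helper name `post_command` and the `post_bin` default; only the `-c` and `-filetypes` options come from the post tool itself.

```python
def post_command(collection, path, filetypes=("eml",), post_bin="bin/post"):
    """Return the argument list for indexing files of the given types."""
    return [
        post_bin,
        "-c", collection,                    # target collection or core
        "-filetypes", ",".join(filetypes),   # note the plural option name
        path,
    ]


if __name__ == "__main__":
    # Hypothetical collection and directory, for illustration only.
    print(" ".join(post_command("example_collection", "/path/to/mail")))
```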

Re: How to access the Solr Admin GUI

2019-01-01 Thread Terry Steichen
I think a better approach to tunneling would be: ssh -p -L :localhost:8983 use...@myremoteserver.example.com This requires you to set up a different port () rather than use the standard 22 port (on your router and on your sshd config).  I've been running something like this for

Resolved Authorization Issue

2018-12-31 Thread Terry Steichen
Thanks, Dominique.  This appears to explain a LOT of past confusion. Terry On 12/31/18 5:26 AM, Dominique Bejean wrote: > So in Solr standalone mode, only authentication is fully functional, not > authorization !

Re: Basic Auth Permission

2018-12-08 Thread Terry Steichen
to retrieve.  And, depending on your system implementation, that information may be only available via a Solr search result (the access to which can be restricted). Terry Steichen On 12/8/18 12:06 AM, Noble Paul wrote: > You can't restrict access to static files. > > You can only restri

Re: Basic Auth Permission

2018-12-04 Thread Terry Steichen
I think there's been some confusion on which standalone versions support authentication.  I'm using 6.6 in cloud mode (purely so the authentication will work).  Some of the documentation seems to say that only cloud implementations support it, but others (like the experts on this forum) say that

Re: Basic Auth Permission

2018-12-04 Thread Terry Steichen
What Solr version are you using? On 12/4/18 2:47 PM, yydpkm wrote: > Thank you for your reply. I use your format and failed. User2 can still > visit collection "name" > Could that be because I am using standalone Solr not Solrcloud? > > > > -- > Sent from:

Re: Basic Auth Permission

2018-12-04 Thread Terry Steichen
In setting his permission, Antony said he set "path": "/admin/file".  I use "path":"/*" - that may be too restrictive for you, but it works fine (for me). On 12/4/18 9:55 AM, yydpkm wrote: > Hi Antony, > > Have you solved this? I am facing the same thing. Other users can still do > /select after

RE: Solr OCR Support

2018-11-04 Thread Terry Steichen
+1 My experience is that you can't easily tell ahead of time whether your PDF is searchable or not. If it isn't, you may not even retrieve it because there's no text to index. Also, if you blindly OCR a file that has already been OCR'd, it can create a mess. Most higher end PDF editors have a

Re: ManagedIndexSchema Bad version when trying to persist schema

2018-10-11 Thread Terry Steichen
Erick, I don't get any such message when I start solr - could you share what that curl command should be? You suggest modifying solrconfig.xml - could you be more explicit on what changes to make? Terry On 10/11/2018 11:52 AM, Erick Erickson wrote: > bq: Also why solr updates and persists the

Re: Solr JVM Memory settings

2018-10-11 Thread Terry Steichen
Don't know if this directly affects what you're trying to do.  But I have an 8GB server and when I run "solr status" I can see what % of the automatic memory allocation is being used.  As it turned out, solr would occasionally exceed that (and crashed).  I then began starting solr with the

Re: Nutch+Solr

2018-10-03 Thread Terry Steichen
Bineesh, I don't use Nutch, so don't know if this is relevant, but I've had similar-sounding failures in doing and restoring backups.  The solution for me was to deactivate authentication while the backup was being done, and then activate it again afterwards.  Then everything was restored

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Terry Steichen
they roughly bisect the problem. But other > things are important too. > > I hope this helps, > Alex. > > > On 26 September 2018 at 16:39, Terry Steichen wrote: >> Shawn, >> >> To the best of my knowledge, I'm not using SolrJ at all. Just >> Solr-

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Terry Steichen
es? Hard to believe (but what is, is, I guess). Terry On 09/26/2018 03:49 PM, Shawn Heisey wrote: > On 9/26/2018 1:23 PM, Terry Steichen wrote: >> I'm pretty sure this was covered earlier.  But I can't find references >> to it.  The question is how to make indexing errors clear and

Making Solr Indexing Errors Visible

2018-09-26 Thread Terry Steichen
or what might have caused the error.)  As I recall, Solr's post tool doesn't give any errors when indexing.  I (vaguely) recall that there's a way (through the logs?) to overcome this and show the errors.  Or maybe it's that you have to do the indexing outside of Solr? Terry Steichen

Re: copy field

2018-07-12 Thread Terry Steichen
":0, "docs":[ { "id":"test2", "meta_creation_date":["2018-04-30T00:00:00Z"], "meta_creation_date_range":"2018-04-30T00:00:00Z", "_version_": 1603034044781559808}, { "id":"tes

Re: Regarding pdf indexing issue

2018-07-11 Thread Terry Steichen
Walter, Well said.  (And I love the hamburger conversion analogy - very apt.) The only thing I will add is that when you have a collection of similar rich text documents, you might be able to construct queries to respect internal structures within the documents.  If all/most of your documents

Re: Solr basic auth

2018-06-15 Thread Terry Steichen
"When authentication is enabled ALL requests must carry valid credentials."  I believe this behavior depends on the value you set for the *blockUnknown* authentication parameter. On 06/15/2018 06:25 AM, Jan Høydahl wrote: > When authentication is enabled ALL requests must carry valid
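To make the *blockUnknown* point concrete, here is a minimal Python sketch that builds a `security.json` skeleton. The two plugin class names are the ones used in Solr's basic authentication documentation; the user name, role, permission, and credential placeholder are hypothetical and would need real values before use.

```python
import json


def build_security_config(block_unknown):
    """Build a minimal security.json structure as a Python dict."""
    return {
        "authentication": {
            # True: every request must carry valid credentials.
            # False: unauthenticated requests may still reach resources
            # that no permission protects.
            "blockUnknown": block_unknown,
            "class": "solr.BasicAuthPlugin",
            # Placeholder only: Solr expects "<hash> <salt>" here.
            "credentials": {"example_user": "<hash> <salt>"},
        },
        "authorization": {
            "class": "solr.RuleBasedAuthorizationPlugin",
            "permissions": [{"name": "security-edit", "role": "admin"}],
            "user-role": {"example_user": "admin"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_security_config(block_unknown=True), indent=2))
```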

Re: Changing Field Assignments

2018-06-14 Thread Terry Steichen
ey seem to be making certain (very basic) assumptions that I'm unclear about, so your help in the preceding would be most appreciated. Thanks. Terry On 06/14/2018 01:51 PM, Shawn Heisey wrote: > On 6/11/2018 2:02 PM, Terry Steichen wrote: >> I am using Solr (6.6.0) in the automatic mo

Changing Field Assignments

2018-06-11 Thread Terry Steichen
I am using Solr (6.6.0) in the automatic mode (where it discovers fields).  It's working fine with one exception.  The problem is that the discovered "meta_creation_date" field is assigned the type TrieDateField.  Unfortunately, that type is limited in a number of ways (like sorting,

Date Query Confusion

2018-05-17 Thread Terry Steichen
To me, one of the more frustrating things I've encountered in Solr is working with date fields.  Supposedly, according to the documentation, this is straightforward.  But in my experience, it is anything but that.  In particular, I've found that the abbreviated forms of date queries don't work as
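For reference, here is a small Python sketch of the fully spelled-out form of a date range query, which avoids relying on abbreviated date forms. The field name `meta_creation_date` is borrowed from other messages in this archive, and the helper names are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def solr_date(value):
    """Format a timezone-aware datetime as a UTC ISO-8601 timestamp."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_range_query(field, start, end):
    """Build an inclusive range query with full timestamps on both ends."""
    return f"{field}:[{solr_date(start)} TO {solr_date(end)}]"


if __name__ == "__main__":
    start = datetime(2018, 1, 1, tzinfo=timezone.utc)
    end = datetime(2018, 5, 17, tzinfo=timezone.utc)
    query = date_range_query("meta_creation_date", start, end)
    print(query)
    print(urlencode({"q": query}))  # URL-encoded form for a /select request
```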

Re: Techniques for Retrieving Hits

2018-05-14 Thread Terry Steichen
of actually locating and retrieving hitlist documents.  My way "seems" to work, and it is quite simple and compact.  I just threw it out seeking a sanity check from others. Terry On 05/14/2018 11:32 AM, Shawn Heisey wrote: > On 5/14/2018 6:46 AM, Terry Steichen wrote: >> In o

Techniques for Retrieving Hits

2018-05-14 Thread Terry Steichen
In order to allow users to retrieve the documents that match a query, I make use of the embedded Jetty container to provide file server functionality.  To make this happen, I provide a symbolic link between the actual document archive, and the Jetty file server.  This seems somewhat of a kludge,

Re: Specialized Solr Application

2018-04-19 Thread Terry Steichen
Thanks, Tim.  A couple of quick comments and a couple of questions: 1) the toughest pdfs to identify are those that are partly searchable (text) and partly not (image-based text).  However, I've found that such documents tend to exist in clusters. 2) email documents (.eml) are no

Re: Specialized Solr Application

2018-04-18 Thread Terry Steichen
OCR is particularly prone to > nonsense. PDFs can be tricky, > there's this spacing parameter that, depending on its setting can > render e r i c k as 5 separate > letters or my name. > > Hey, you asked! Don't complain about long answers ;) > > Best, > Erick

Re: Specialized Solr Application

2018-04-17 Thread Terry Steichen
ge- > From: Charlie Hull [mailto:char...@flax.co.uk] > Sent: Tuesday, April 17, 2018 4:17 AM > To: solr-user@lucene.apache.org > Subject: Re: Specialized Solr Application > > On 16/04/2018 19:48, Terry Steichen wrote: >> I have from time-to-time posted questions to

Specialized Solr Application

2018-04-16 Thread Terry Steichen
lr list.  So, if you encounter problems peculiar to this kind of setup, we can perhaps help handle them off-list (although if they have more general Solr application, we should, of course, post them to the list). Terry Steichen

Re: [ANNOUNCE] Solr Reference Guide for Solr 7.3 released

2018-04-05 Thread Terry Steichen
and/or early release might be reflected back in the original change (11622). Anyway, I'm a happy camper now.  Thanks to all. On 04/05/2018 11:37 AM, Shawn Heisey wrote: > On 4/5/2018 9:05 AM, Terry Steichen wrote: >> I'm a bit confused because of the issue I was concerned about earlier:

Re: [ANNOUNCE] Solr Reference Guide for Solr 7.3 released

2018-04-05 Thread Terry Steichen
I'm a bit confused because of the issue I was concerned about earlier:  https://issues.apache.org/jira/browse/SOLR-11622 It was supposed to be fixed and included in (the then-future) 7.3, but I don't see it there in the listed 7.3.0 changes/bug-fixes. Am I missing something? On 04/05/2018 10:05

Re: Resetting Authentication/Authorization

2018-03-30 Thread Terry Steichen
On 03/29/2018 11:07 PM, Shawn Heisey wrote: > On 3/29/2018 8:28 PM, Terry Steichen wrote: >> When I set up the initial authentications and authorizations (I'm using >> 6.6.0 and running in cloud mode.), I call "bin/solr auth enable >> -credentials xxx:yyy".

Resetting Authentication/Authorization

2018-03-29 Thread Terry Steichen
When I set up the initial authentications and authorizations (I'm using 6.6.0 and running in cloud mode.), I call "bin/solr auth enable -credentials xxx:yyy".  I then use a series of additional API calls (to create additional users and permissions).  This creates my desired security environment
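The additional API calls are not shown in the message. As a hedged sketch of what such calls typically carry, the Python below builds JSON bodies for the `set-user`, `set-user-role`, and `set-permission` commands. The endpoint paths follow the Solr reference guide, and the `collection-admin-edit` permission name appears elsewhere in this archive; the helper name, user name, password, and role are made up for illustration.

```python
import json

AUTHENTICATION_PATH = "/solr/admin/authentication"
AUTHORIZATION_PATH = "/solr/admin/authorization"


def security_requests(user, password, role):
    """Return (path, JSON body) pairs to POST, in order."""
    return [
        # 1. Create the user (or change the password of an existing one).
        (AUTHENTICATION_PATH, json.dumps({"set-user": {user: password}})),
        # 2. Attach a role to that user.
        (AUTHORIZATION_PATH, json.dumps({"set-user-role": {user: [role]}})),
        # 3. Restrict a predefined permission to that role.
        (AUTHORIZATION_PATH, json.dumps(
            {"set-permission": {"name": "collection-admin-edit", "role": role}}
        )),
    ]


if __name__ == "__main__":
    for path, body in security_requests("example_user", "example_password", "admin"):
        print("POST", path, body)
```

Each body would be sent with a JSON content type and the admin credentials created by "bin/solr auth enable".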

Three Indexing Questions

2018-03-29 Thread Terry Steichen
First question: When indexing content in a directory, Solr's normal behavior is to recursively index all the files found in that directory and its subdirectories.  However, it turns out that when the files are of the form *.eml (email), Solr won't do that.  I can use a wildcard to get it to index the

Re: Continuing Saga of Authorization on 6.6.0

2018-03-13 Thread Terry Steichen
AM, Terry Steichen wrote: >> What also puzzles me is that I can't find any "security.json" file.  >> Clearly, solr is persistently keeping track of the >> authentication/authorization information, but I don't see where.  I >> suppose it might be kept in zookeeper (wh

Re: Continuing Saga of Authorization on 6.6.0

2018-03-13 Thread Terry Steichen
t I've not noticed any big > differences between the security for our 6.3 deployments and the 7.X ones. > > Best, > Chris > > On Tue, Mar 13, 2018 at 12:47 PM Terry Steichen <te...@net-frame.com> wrote: > >> I switched solr from standalone to cloud and created th

Continuing Saga of Authorization on 6.6.0

2018-03-13 Thread Terry Steichen
mission":{     "name":"collection-admin-edit",     "role":"admin"},   "errorMessages":["Unknown operation 'set-permission' "]}]} This really makes no sense at all (or, I'm really losing it - always a distinct possib

Resend: Authorization on 6.6.0

2018-03-12 Thread Terry Steichen
I'm resending the information below because the original message got the security.json stuff garbled. I'm using 6.6.0 with security.json active, having the content shown below.  I am running standalone mode, have two

Authorization in Solr 6.6.0 Not Working Properly

2018-03-12 Thread Terry Steichen
I'm using 6.6.0 with security.json active, having the content shown below.  I am running standalone mode, have two solr cores defined: email1, and email2.  Since the 'blockUnknown' is set to false, everyone should have access to any unprotected resource.  As you can see, I have three users

Setting Up Solr Authentication/Authorization

2018-03-09 Thread Terry Steichen
I'm trying to set up basic authentication/authorization with solr 6.6.0. The documentation says to create a security.json file and describes the content as: { "authentication":{ "class":"solr.BasicAuthPlugin", "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=

Re: Solr Read-Only?

2018-03-06 Thread Terry Steichen
directory R-O? Terry On 03/06/2018 04:20 PM, Christopher Schultz wrote: > Terry, > > On 3/6/18 4:08 PM, Terry Steichen wrote: > > Is it possible to run solr in a read-only directory? > > > I'm running it just fine on a ubuntu server which is accessible >

Solr Read-Only?

2018-03-06 Thread Terry Steichen
Is it possible to run solr in a read-only directory? I'm running it just fine on a ubuntu server which is accessible only through SSH tunneling.  At the platform level, this is fine: only authorized users can access it (via a browser on their machine accessing a forwarded port).  The problem is

Re: Challenges of Indexing Email

2018-02-26 Thread Terry Steichen
ort for this > https://issues.apache.org/jira/browse/SOLR-11622 which is fixed for future > release. > > Before running into this issue we were running 6.4.2 which did not have > this bug. > > On Mon, Feb 26, 2018 at 9:59 AM, Terry Steichen <te...@net-frame.com> wrote: >

Challenges of Indexing Email

2018-02-26 Thread Terry Steichen
I am using Solr 7.2.1 and trying to index (among other documents) individual emails and collected email threads.  Ideally, the indexing would parse the email messages into their constituent fields.  But, for my purposes, an acceptable alternative is to merely index the messages as unstructured