I thought that was supposed to work; however, I end up using:
hostname:8080/solr/... where hostname is my local PC's hostname.
--
Peter Dietz
Systems Developer/Engineer
Ohio State University Libraries
On Fri, Jan 14, 2011 at 2:29 PM, George Stanley Kozak <[email protected]> wrote:
> Thank you, Peter:
>
>
>
> I did run into a problem executing the command that you posted, so I am
> wondering if I may have a configuration issue(?).
>
>
>
> If I attempt to execute:
> http://<my dspace server>/solr/statistics/select?q=type:0+OR+type:2+OR+type:3+OR+type:4
>
>
>
> I get the infamous: “*Access to the specified resource () has been
> forbidden*.”
>
>
>
> Should I be able to get into the solr stats this way?
>
>
>
> George Kozak
> Digital Library Specialist
> Cornell University Library Information Technologies (CUL-IT)
> 501 Olin Library
> Cornell University
> Ithaca, NY 14853
> 607-255-8924
>
>
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of *Peter
> Dietz
> *Sent:* Friday, January 14, 2011 12:54 PM
> *To:* George Stanley Kozak
> *Cc:* [email protected]
> *Subject:* Re: [Dspace-tech] Solr Statistcs still not working
>
>
>
> Hi George,
>
>
>
> I'm hoping something comes out of this discussion too, as our solr instance
> is not running fast enough to query.
>
>
>
> You can also check whether the stats util can delete some of the bot
> entries from the logs. I've also noticed that the default spiders config
> doesn't include matches for msnbot/bingbot, so we might need to build
> a user-agent-based search-and-destroy for those entries.
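
A minimal sketch of the matching half of such a user-agent-based cleanup, in Python. It only classifies hits that have already been fetched; it does not delete anything from Solr. The pattern list and the function names `is_bot` and `split_hits` are hypothetical, chosen for illustration, and are not part of DSpace or its stats util.

```python
import re

# Hypothetical pattern list: msnbot and bingbot are the two crawlers named
# above; extend it with whatever else shows up in your own logs.
BOT_PATTERNS = [r"msnbot", r"bingbot"]
BOT_RE = re.compile("|".join(BOT_PATTERNS), re.IGNORECASE)


def is_bot(user_agent):
    """Return True if the userAgent string matches a known crawler pattern."""
    return bool(user_agent) and BOT_RE.search(user_agent) is not None


def split_hits(hits):
    """Partition hit dicts (each with a 'userAgent' key) into (human, bot)."""
    human, bot = [], []
    for hit in hits:
        (bot if is_bot(hit.get("userAgent")) else human).append(hit)
    return human, bot
```

The bot half of the result could then feed whatever removal mechanism is eventually chosen.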
>
>
>
>
>
> You can get a count of how many records are in Solr with:
>
>
> http://localhost:8080/solr/statistics/select?q=type:0+OR+type:2+OR+type:3+OR+type:4
>
> Look at: <result name="response" numFound="9307348" start="0">
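
For anyone who would rather script this check than read the XML by eye, here is a rough Python sketch that builds the same query and pulls out `numFound`. The function names are made up for illustration. It adds `rows=0` (a standard Solr parameter) so that only the count comes back, and it assumes the default XML response format.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen


def build_count_url(base_url, query="type:0 OR type:2 OR type:3 OR type:4"):
    """Build a select URL that asks for the match count only (rows=0)."""
    return base_url.rstrip("/") + "/select?" + urlencode({"q": query, "rows": 0})


def parse_num_found(response_xml):
    """Extract numFound from a Solr XML response body."""
    root = ET.fromstring(response_xml)
    result = root.find("result[@name='response']")
    if result is None:
        raise ValueError("no <result name='response'> element in response")
    return int(result.get("numFound"))


def count_records(base_url):
    """Fetch the count from a live Solr core (needs network access to Solr)."""
    with urlopen(build_count_url(base_url)) as resp:
        return parse_num_found(resp.read())
```

Calling `count_records("http://localhost:8080/solr/statistics")` on the server itself should return the same number that appears in `numFound`.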
>
>
> I don't know if solr has a performance break-off, where after a certain
> number of documents the performance trails off, but I've noticed that giving
> more memory always helps.
>
>
>
> Our production server runs on a VM with 2.5GB of memory, and that's not
> enough for all of the existing webapps and SOLR to work well. SOLR queries
> are abysmally slow; however, the usage events are still being
> logged/inserted just fine.
>
>
>
> To test things out, I've copied our production /dspace/solr directory to my
> workstation, which has 6GB of memory, and queries run much faster, as in they
> finish before a user turns the computer off. So the bitter pill could be
> that you will want either to boost your production server with more memory
> or to have a dedicated SOLR server.
>
>
>
> One thing I've noticed about the Solr index data files is that some of
> them are really, really big. In our instance we have:
>
> Size  Filename
> 933M  _3dd02.fdt
> 158M  _3j75t.fdt
> 146M  _3dd02.frq
> 136M  _3dyoh.fdt
> 132M  _3dtx2.fdt
> 132M  _3dmlz.fdt
> 131M  _3iyu2.fdt
> ...
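
A listing like the one above can be produced with a few lines of Python. This is only a sketch: the helper names are invented here, and the index location used in the usage note is an assumption about a typical install, not something stated in this thread.

```python
import os


def largest_files(index_dir, limit=10):
    """Return (size_in_bytes, filename) pairs, largest first."""
    sizes = []
    with os.scandir(index_dir) as entries:
        for entry in entries:
            if entry.is_file():
                sizes.append((entry.stat().st_size, entry.name))
    sizes.sort(reverse=True)
    return sizes[:limit]


def format_size(num_bytes):
    """Render a byte count as whole megabytes, e.g. '933M'."""
    return "%dM" % round(num_bytes / (1024 * 1024))
```

For example, looping over `largest_files("/dspace/solr/statistics/data/index")` (path assumed) and printing `format_size(size)` next to each name gives the same two-column layout.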
>
>
>
> So the pain could be that Solr has to open an especially large 933MB
> file from disk and load it into memory. On memory-limited machines this can
> cause Java out-of-memory errors, or just poor performance. I'm wondering if
> there is some performance enhancement that could go into our implementation
> of the Solr client in DSpace to create a more optimized index.
>
>
>
> Another approach could be to limit the data we store in solr. We currently
> store a bunch of things, and it all adds up, especially when you have 9
> million of them.
>
> Currently an example hit is:
>
> <doc>
>   <str name="city">Columbus</str>
>   <str name="continent">NA</str>
>   <str name="countryCode">US</str>
>   <str name="dns">ack5859s3.lib.ohio-state.edu.</str>
>   <int name="id">1256</int>
>   <str name="ip">128.146.175.194</str>
>   <bool name="isBot">false</bool>
>   <float name="latitude">40.029007</float>
>   <float name="longitude">-83.0809</float>
>   <arr name="owningComm">
>     <int>154</int>
>     <int>19</int>
>     <int>19</int>
>   </arr>
>   <date name="time">2010-06-14T16:39:33.586Z</date>
>   <int name="type">3</int>
>   <str name="userAgent">Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.366.2 Safari/533.4</str>
> </doc>
>
>
>
> Perhaps after a certain time period (6 months) we could have a
> super-optimize where we squash results per
> community/collection/item/bitstream down to monthly/daily results. So instead
> of determining that collection:1256 has 945 hits by finding all 945 records,
> it might be more efficient to count 6 monthly aggregate records
> that have a value of a couple hundred hits. This approach would lose some of
> the fine-grained quality of the search results that SOLR gives us, but it
> would make the process much faster. For instance, we could run a query that
> returns all the hits you've had from visitors running 64-bit Linux in
> the US and gives the result per hour.
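
To make the roll-up idea concrete, here is a small Python sketch of squashing individual hit records into monthly counts. It is an illustration only, not something DSpace or Solr does; the function names are hypothetical, and it assumes each hit carries the `type`, `id` and `time` fields shown in the example document, with timestamps that include fractional seconds.

```python
from collections import Counter
from datetime import datetime


def month_key(timestamp):
    """Reduce a Solr-style UTC timestamp to a 'YYYY-MM' bucket."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.strftime("%Y-%m")


def rollup_monthly(hits):
    """Count hits per (type, id, month)."""
    counts = Counter()
    for hit in hits:
        counts[(hit["type"], hit["id"], month_key(hit["time"]))] += 1
    return counts
```

Each key in the result stands in for one aggregate record, so a total over six months is a sum of six values rather than a scan over every individual hit. Fields such as `userAgent` and `countryCode` are dropped, which is exactly the loss of detail described above.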
>
>
>
> I do really like all the data in SOLR, especially when we upgrade our
> system to 1.7, which has discovery, for then we can combine statistics and
> real data for the more interesting queries.
>
> I hope we can make progress on our Solr implementation; however, I'm still
> looking at things like Elastic Search and services like Loggly for the time
> being.
>
>
>
> --
> Peter Dietz
> Systems Developer/Engineer
> Ohio State University Libraries
>
>
> On Fri, Jan 14, 2011 at 11:53 AM, George Stanley Kozak <[email protected]>
> wrote:
>
> Hi…
>
>
>
> Last month I wrote about a problem that I was having with the Solr
> Statistics (I am using DSpace 1.6.2). Starting in December, I was noticing
> that we were getting high CPU usage, and I learned that the statistics were
> not appearing for users. When they clicked the “View Statistics” button,
> the browser seemed to “churn” and the stats never appeared.
>
>
>
> After the advice that I received from various people (thank you, all), I
> added the Solr patch and made the changes to the Solr config file. I
> recompiled things and then ran the “dspace stats-util --optimize” step.
> However, what I now have is just a partial fix. I don’t have the
> out-of-control CPU usage, but the statistics never appear. If a user presses
> the “View Statistics” button, the browser just seems to “churn” and the
> statistics never appear.
>
>
>
> For the time being, I have temporarily removed the “View Statistics” button
> from community-home.jsp, collection-home.jsp and display-item.jsp.
>
>
>
> Beyond what I have already done, does anyone have any other suggestions?
> By the way, the Statistics work OK on my test system (which has a smaller
> database and less traffic).
>
>
>
> George Kozak
> Digital Library Specialist
> Cornell University Library Information Technologies (CUL-IT)
> 501 Olin Library
> Cornell University
> Ithaca, NY 14853
> 607-255-8924
>
>
>
>
>
> ------------------------------------------------------------------------------
> Protect Your Site and Customers from Malware Attacks
> Learn about various malware tactics and how to avoid them. Understand
> malware threats, the impact they can have on your business, and how you
> can protect your company and customers by using code signing.
> http://p.sf.net/sfu/oracle-sfdevnl
> _______________________________________________
> DSpace-tech mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>
>
>