[ 
http://jira.dspace.org/jira/browse/DS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11004#action_11004
 ] 

Stuart Lewis commented on DS-364:
---------------------------------

Hi Peter,

Thanks for trying the code - it is possibly the first time it has been tested 
with a large amount of production data.

But then it tossed a million Java errors:
solroutput.log.2009-06-15
Processing file: /dspace/log/solroutput.log.2009-06-15
java.net.SocketTimeoutException
        at org.xbill.DNS.Client.blockUntil(Client.java:43)
        at org.xbill.DNS.UDPClient.recv(UDPClient.java:43)
        at org.xbill.DNS.UDPClient.sendrecv(UDPClient.java:70)
        at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:256)
        at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:93)
        at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:359)
        at org.dspace.statistics.util.DnsLookup.reverseDns(DnsLookup.java:36)
        at org.dspace.statistics.util.StatisticsImporter.load(StatisticsImporter.java:191)
        at org.dspace.statistics.util.StatisticsImporter.main(StatisticsImporter.java:75)

I'm wondering if it just skips over entries that cause this, or if some 
SocketTimeout constant needs to be bumped up?

I might add a new command line option to skip the DNS lookups as it will be 
quite slow performing so many thousands of reverse lookups. It also doesn't 
have a cache - I'll add one of those.
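
A minimal sketch of what such a cache with a skip option could look like. 
This is not the code in the patch: the class CachingReverseDns, its Resolver 
interface and the skipLookups flag are hypothetical names, and the real lookup 
is only assumed to throw an IOException (such as the SocketTimeoutException 
above) when it fails.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of the DS-364 patch. */
class CachingReverseDns {
    /** Stand-in for the real lookup; assumed to throw IOException on a timeout. */
    interface Resolver {
        String resolve(String ip) throws IOException;
    }

    private final Resolver resolver;
    private final boolean skipLookups;
    private final Map<String, String> cache;

    CachingReverseDns(Resolver resolver, boolean skipLookups, final int maxEntries) {
        this.resolver = resolver;
        this.skipLookups = skipLookups;
        // Access-ordered LinkedHashMap that evicts the least recently used entry
        // once the cache grows beyond maxEntries.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the host name for ip, or ip itself if lookups are skipped or fail. */
    String lookup(String ip) {
        if (skipLookups) {
            return ip;
        }
        String host = cache.get(ip);
        if (host == null) {
            try {
                host = resolver.resolve(ip);
            } catch (IOException e) {
                host = null; // for example a SocketTimeoutException
            }
            if (host == null) {
                host = ip; // cache the fallback so a failing address is tried only once
            }
            cache.put(ip, host);
        }
        return host;
    }
}
```

Caching the fallback as well as the successes is a design choice in this 
sketch: a crawler address that times out would then cost one timeout per run 
rather than one per log line.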

Here are a few lines from solroutput.log.2009-06-15:
20090616000000288,view_item,6177,2009-06-16T00:00:00,anonymous,66.249.68.172
20090616000001115,view_item,38737,2009-06-16T00:00:01,anonymous,66.249.68.68
20090616000001939,view_bitstream,77155,2009-06-16T00:00:01,anonymous,66.249.68.68
20090616000003457,view_item,5127,2009-06-16T00:00:03,anonymous,66.249.68.172
20090616000004278,view_item,16976,2009-06-16T00:00:04,anonymous,66.249.68.172

(These are all from googlebot - we need to make sure they don't get counted - 
see http://jira.dspace.org/jira/browse/DS-440)
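
For anyone who wants to experiment with lines like the ones quoted above, here 
is a rough parsing sketch. The field names (id, action, objectId, time, user, 
ip) are guesses based only on those sample lines, and SolrOutputLine is a 
hypothetical class, not something in the patch.

```java
import java.time.LocalDateTime;

/** Hypothetical value class; the field names are guesses from the sample lines. */
final class SolrOutputLine {
    final String id;
    final String action;
    final int objectId;
    final LocalDateTime time;
    final String user;
    final String ip;

    private SolrOutputLine(String id, String action, int objectId,
                           LocalDateTime time, String user, String ip) {
        this.id = id;
        this.action = action;
        this.objectId = objectId;
        this.time = time;
        this.user = user;
        this.ip = ip;
    }

    /** Parses one comma-separated line; rejects lines without exactly six fields. */
    static SolrOutputLine parse(String line) {
        String[] f = line.trim().split(",", -1);
        if (f.length != 6) {
            throw new IllegalArgumentException(
                    "Expected 6 fields but found " + f.length + ": " + line);
        }
        // The timestamp in the samples is ISO local date-time, e.g. 2009-06-16T00:00:01.
        return new SolrOutputLine(f[0], f[1], Integer.parseInt(f[2]),
                LocalDateTime.parse(f[3]), f[4], f[5]);
    }
}
```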

@stuartlewis There is a small typo in that comment (StatistcisImporter)

Which comment?

If anyone wants, I can email solroutput.log.2009-06-15; it is 2.2M. However, my 
dspace.log.2009-06-15 is 30M.

That would be great! Would you be happy to attach it to the JIRA entry? (We'll 
need to write a little script to convert its handles into handles that exist on 
whichever test machine the import runs on, as the importer has to look up the 
DSpace object from the handle.)

Lastly, execution halted with this (machine has 4GB RAM):
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.lang.Class.getDeclaredFields0(Native Method)
        at java.lang.Class.privateGetDeclaredFields(Class.java:2291)
        at java.lang.Class.getDeclaredField(Class.java:1880)
        at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.<init>(AtomicReferenceFieldUpdater.java:181)
        at java.util.concurrent.atomic.AtomicReferenceFieldUpdater.newUpdater(AtomicReferenceFieldUpdater.java:65)
        at java.sql.SQLException.<clinit>(SQLException.java:353)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1295)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:188)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:354)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:258)
        at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:93)
        at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:93)
        at org.dspace.storage.rdbms.DatabaseManager.queryTable(DatabaseManager.java:239)
        at org.dspace.content.Item.retrieveMetadata(Item.java:202)
        at org.dspace.content.Item.<init>(Item.java:148)
        at org.dspace.content.Bundle.getItems(Bundle.java:358)
        at org.dspace.statistics.SolrLogger.storeParents(SolrLogger.java:295)
        at org.dspace.statistics.util.StatisticsImporter.load(StatisticsImporter.java:239)
        at org.dspace.statistics.util.StatisticsImporter.main(StatisticsImporter.java:75)

Maybe we need to call some context 'completes' from time to time to clear its 
cache? I'll look at that one too.
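
The general shape of that idea, sketched under assumptions: process the 
records in batches and release whatever has accumulated after every batch. 
BatchedImport and its Session interface are hypothetical stand-ins and 
deliberately not the DSpace Context API; which context call would actually 
free the cached objects still has to be checked.

```java
/** Hypothetical sketch of batch-wise processing; not the DSpace Context API. */
final class BatchedImport {
    /** Stand-in for whatever holds the per-record state and its cache. */
    interface Session {
        void process(String record);

        /** Commits pending work and drops cached objects. */
        void flush();
    }

    /** Processes all records, flushing after every batchSize records. */
    static int run(Iterable<String> records, Session session, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int count = 0;
        for (String record : records) {
            session.process(record);
            count++;
            if (count % batchSize == 0) {
                session.flush(); // keep the cache from growing with the log size
            }
        }
        if (count % batchSize != 0) {
            session.flush(); // final partial batch
        }
        return count;
    }
}
```

With a scheme like this, memory use would depend on the batch size rather 
than on the number of lines in the log file, provided the flush really does 
release the cached objects.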

Many thanks for the feedback - it is invaluable for getting the code working 
well.

Thanks,


Stuart Lewis
IT Innovations Analyst and Developer
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: 64 9 373-7599 x81928
http://www.library.auckland.ac.nz/


> Script to convert legacy dspace.log stats into solr stats records
> -----------------------------------------------------------------
>
>                 Key: DS-364
>                 URL: http://jira.dspace.org/jira/browse/DS-364
>             Project: DSpace 1.x
>          Issue Type: Sub-task
>            Reporter: Stuart Lewis
>            Assignee: Stuart Lewis
>             Fix For: 1.6.0
>
>         Attachments: [DS-364]-for-review.patch
>
>



        

_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel
