[
https://issues.apache.org/jira/browse/SDAP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589172#comment-16589172
]
ASF GitHub Bot commented on SDAP-120:
-------------------------------------
lewismc commented on issue #32: SDAP-120 Error trying to ingest logs
URL:
https://github.com/apache/incubator-sdap-mudrod/pull/32#issuecomment-415118873
Hi @fgreg, this issue has been resolved; we've addressed the regression in
the HTTPD server producing the logs, which means logs are now written in the
combined log format.
New logs can be found at
ftp://podaac.jpl.nasa.gov/misc/outgoing/cjf/mudrod/2018/07/
...where the logs to be used are all artifacts APART from the one
containing **old** in its name. From now on, all servers will log in the
combined log format.
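For context, the combined log format is Apache HTTPD's common log format with quoted Referer and User-Agent fields appended. A minimal sketch of recognising such a line in Java (the sample line, class, and helper are illustrative, not Mudrod's actual parser):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CombinedLogParser {
    // Combined log format:
    //   host ident authuser [date] "request" status bytes "referer" "user-agent"
    static final Pattern COMBINED = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+) \"([^\"]*)\" \"([^\"]*)\"$");

    // Returns the HTTP status code of a combined-format line, or -1 if the
    // line does not match the format (e.g. an old-format log line).
    static int statusOf(String line) {
        Matcher m = COMBINED.matcher(line);
        return m.matches() ? Integer.parseInt(m.group(6)) : -1;
    }

    public static void main(String[] args) {
        String line = "127.0.0.1 - frank [10/Oct/2018:13:55:36 -0700] "
                + "\"GET /datasetlist HTTP/1.1\" 200 2326 \"-\" \"Mozilla/5.0\"";
        System.out.println(statusOf(line)); // 200
    }
}
```

A parser expecting this layout silently matches nothing when fed logs in another format, which fits the ingestion failure reported below.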
I'm going to close this off for the time being; please re-open if you feel
this is still an issue.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Error trying to ingest logs
> ---------------------------
>
> Key: SDAP-120
> URL: https://issues.apache.org/jira/browse/SDAP-120
> Project: Apache Science Data Analytics Platform
> Issue Type: Bug
> Components: mudrod
> Reporter: Frank Greguska
> Priority: Blocker
>
> Trying to ingest the January 2018 logs results in an error:
>
> {quote}
> 2018-07-09 18:06:29,119 INFO server.Server (Server.java:doStart(379)) - Started @3794ms
> 2018-07-09 18:06:29,381 INFO handler.ContextHandler (ContextHandler.java:doStart(744)) - Started o.s.j.s.ServletContextHandler@11dcd42c{/metrics/json,null,AVAILABLE}
> 2018-07-09 18:06:29,874 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:<init>(51)) - Started Mudrod Weblog Discovery Engine.
> 2018-07-09 18:06:29,874 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:preprocess(98)) - Starting Web log preprocessing.
> 2018-07-09 18:06:29,875 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:preprocess(106)) - Processing logs dated 201801.gz
> 2018-07-09 18:06:30,013 INFO pre.ImportLogFile (ImportLogFile.java:execute(80)) - Starting Log Import 201801.gz
> 2018-07-09 18:06:31,084 INFO util.Version (Version.java:logVersion(108)) - Elasticsearch Hadoop v5.2.0 [d85a257f9f]
> 2018-07-09 18:06:31,451 INFO rdd.EsRDDWriter (RestService.java:createWriter(562)) - Writing to [log201801.gz/raw.http]
> 2018-07-09 18:08:15,371 INFO rdd.EsRDDWriter (RestService.java:createWriter(562)) - Writing to [log201801.gz/raw.ftp]
> 2018-07-09 18:13:15,916 INFO pre.ImportLogFile (ImportLogFile.java:execute(84)) - Log Import complete. Time elapsed 405 seconds
> 2018-07-09 18:13:15,925 INFO pre.CrawlerDetection (CrawlerDetection.java:execute(82)) - Starting Crawler detection raw.http
> 2018-07-09 18:13:16,262 ERROR main.MudrodEngine (MudrodEngine.java:main(395)) - Error whilst parsing command line.
> java.lang.IllegalArgumentException: [size] must be greater than 0. Found [0] in [Users]
>     at org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder.size(TermsAggregationBuilder.java:148)
>     at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserTerms(LogAbstract.java:127)
>     at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserDocs(LogAbstract.java:135)
>     at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserRDD(LogAbstract.java:100)
>     at org.apache.sdap.mudrod.weblog.pre.CrawlerDetection.checkByRateInParallel(CrawlerDetection.java:112)
>     at org.apache.sdap.mudrod.weblog.pre.CrawlerDetection.execute(CrawlerDetection.java:85)
>     at org.apache.sdap.mudrod.discoveryengine.WeblogDiscoveryEngine.preprocess(WeblogDiscoveryEngine.java:112)
>     at org.apache.sdap.mudrod.main.MudrodEngine.startFullIngest(MudrodEngine.java:240)
>     at org.apache.sdap.mudrod.main.MudrodEngine.main(MudrodEngine.java:385)
> {quote}
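The stack trace shows the "Users" terms aggregation being built with a size of 0, i.e. no user records survived log parsing, and Elasticsearch's TermsAggregationBuilder rejects sizes of 0 or less. A minimal sketch of a fail-fast guard before building the aggregation (the class, helper name, and message are hypothetical, not Mudrod's actual code):

```java
public class TermsSizeGuard {
    // TermsAggregationBuilder.size(int) throws IllegalArgumentException for
    // values <= 0, as seen in the stack trace above. This hypothetical helper
    // surfaces the likely root cause (unparseable logs) with a clearer message
    // before any Elasticsearch query is built.
    static int checkedTermsSize(long userCount) {
        if (userCount <= 0) {
            throw new IllegalStateException(
                "No user records found; check that the input logs are in the combined log format.");
        }
        return Math.toIntExact(userCount);
    }

    public static void main(String[] args) {
        System.out.println(checkedTermsSize(42)); // 42
    }
}
```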
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)