[ https://issues.apache.org/jira/browse/SDAP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank Greguska updated SDAP-120:
--------------------------------
Description:
Trying to ingest the January 2018 logs results in an error:
{quote}
2018-07-09 18:06:29,119 INFO server.Server (Server.java:doStart(379)) - Started @3794ms
2018-07-09 18:06:29,381 INFO handler.ContextHandler (ContextHandler.java:doStart(744)) - Started o.s.j.s.ServletContextHandler@11dcd42c{/metrics/json,null,AVAILABLE}
2018-07-09 18:06:29,874 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:<init>(51)) - Started Mudrod Weblog Discovery Engine.
2018-07-09 18:06:29,874 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:preprocess(98)) - Starting Web log preprocessing.
2018-07-09 18:06:29,875 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:preprocess(106)) - Processing logs dated 201801.gz
2018-07-09 18:06:30,013 INFO pre.ImportLogFile (ImportLogFile.java:execute(80)) - Starting Log Import 201801.gz
2018-07-09 18:06:31,084 INFO util.Version (Version.java:logVersion(108)) - Elasticsearch Hadoop v5.2.0 [d85a257f9f]
2018-07-09 18:06:31,451 INFO rdd.EsRDDWriter (RestService.java:createWriter(562)) - Writing to [log201801.gz/raw.http]
2018-07-09 18:08:15,371 INFO rdd.EsRDDWriter (RestService.java:createWriter(562)) - Writing to [log201801.gz/raw.ftp]
2018-07-09 18:13:15,916 INFO pre.ImportLogFile (ImportLogFile.java:execute(84)) - Log Import complete. Time elapsed 405 seconds
2018-07-09 18:13:15,925 INFO pre.CrawlerDetection (CrawlerDetection.java:execute(82)) - Starting Crawler detection raw.http
java.lang.IllegalArgumentException: [size] must be greater than 0. Found [0] in [Users]
at org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder.size(TermsAggregationBuilder.java:148)
at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserTerms(LogAbstract.java:127)
at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserDocs(LogAbstract.java:135)
at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserRDD(LogAbstract.java:100)
at org.apache.sdap.mudrod.weblog.pre.CrawlerDetection.checkByRateInParallel(CrawlerDetection.java:112)
at org.apache.sdap.mudrod.weblog.pre.CrawlerDetection.execute(CrawlerDetection.java:85)
at org.apache.sdap.mudrod.discoveryengine.WeblogDiscoveryEngine.preprocess(WeblogDiscoveryEngine.java:112)
at org.apache.sdap.mudrod.main.MudrodEngine.startFullIngest(MudrodEngine.java:240)
at org.apache.sdap.mudrod.main.MudrodEngine.main(MudrodEngine.java:385)
{quote}
was:
Trying to ingest the January 2018 logs results in an error:
{quote}
java.lang.IllegalArgumentException: [size] must be greater than 0. Found [0] in [Users]
at org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder.size(TermsAggregationBuilder.java:148)
at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserTerms(LogAbstract.java:127)
at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserDocs(LogAbstract.java:135)
at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserRDD(LogAbstract.java:100)
at org.apache.sdap.mudrod.weblog.pre.CrawlerDetection.checkByRateInParallel(CrawlerDetection.java:112)
at org.apache.sdap.mudrod.weblog.pre.CrawlerDetection.execute(CrawlerDetection.java:85)
at org.apache.sdap.mudrod.discoveryengine.WeblogDiscoveryEngine.preprocess(WeblogDiscoveryEngine.java:112)
at org.apache.sdap.mudrod.main.MudrodEngine.startFullIngest(MudrodEngine.java:240)
at org.apache.sdap.mudrod.main.MudrodEngine.main(MudrodEngine.java:385)
{quote}
> Error trying to ingest logs
> ---------------------------
>
> Key: SDAP-120
> URL: https://issues.apache.org/jira/browse/SDAP-120
> Project: Apache Science Data Analytics Platform
> Issue Type: Bug
> Components: mudrod
> Reporter: Frank Greguska
> Priority: Blocker
>
> Trying to ingest the January 2018 logs results in an error:
>
> {quote}
> 2018-07-09 18:06:29,119 INFO server.Server (Server.java:doStart(379)) - Started @3794ms
> 2018-07-09 18:06:29,381 INFO handler.ContextHandler (ContextHandler.java:doStart(744)) - Started o.s.j.s.ServletContextHandler@11dcd42c{/metrics/json,null,AVAILABLE}
> 2018-07-09 18:06:29,874 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:<init>(51)) - Started Mudrod Weblog Discovery Engine.
> 2018-07-09 18:06:29,874 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:preprocess(98)) - Starting Web log preprocessing.
> 2018-07-09 18:06:29,875 INFO discoveryengine.WeblogDiscoveryEngine (WeblogDiscoveryEngine.java:preprocess(106)) - Processing logs dated 201801.gz
> 2018-07-09 18:06:30,013 INFO pre.ImportLogFile (ImportLogFile.java:execute(80)) - Starting Log Import 201801.gz
> 2018-07-09 18:06:31,084 INFO util.Version (Version.java:logVersion(108)) - Elasticsearch Hadoop v5.2.0 [d85a257f9f]
> 2018-07-09 18:06:31,451 INFO rdd.EsRDDWriter (RestService.java:createWriter(562)) - Writing to [log201801.gz/raw.http]
> 2018-07-09 18:08:15,371 INFO rdd.EsRDDWriter (RestService.java:createWriter(562)) - Writing to [log201801.gz/raw.ftp]
> 2018-07-09 18:13:15,916 INFO pre.ImportLogFile (ImportLogFile.java:execute(84)) - Log Import complete. Time elapsed 405 seconds
> 2018-07-09 18:13:15,925 INFO pre.CrawlerDetection (CrawlerDetection.java:execute(82)) - Starting Crawler detection raw.http
> java.lang.IllegalArgumentException: [size] must be greater than 0. Found [0] in [Users]
> at org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder.size(TermsAggregationBuilder.java:148)
> at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserTerms(LogAbstract.java:127)
> at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserDocs(LogAbstract.java:135)
> at org.apache.sdap.mudrod.weblog.pre.LogAbstract.getUserRDD(LogAbstract.java:100)
> at org.apache.sdap.mudrod.weblog.pre.CrawlerDetection.checkByRateInParallel(CrawlerDetection.java:112)
> at org.apache.sdap.mudrod.weblog.pre.CrawlerDetection.execute(CrawlerDetection.java:85)
> at org.apache.sdap.mudrod.discoveryengine.WeblogDiscoveryEngine.preprocess(WeblogDiscoveryEngine.java:112)
> at org.apache.sdap.mudrod.main.MudrodEngine.startFullIngest(MudrodEngine.java:240)
> at org.apache.sdap.mudrod.main.MudrodEngine.main(MudrodEngine.java:385)
> {quote}
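
A note on reading the trace: the message says a terms aggregation named `Users` was given a size of 0, and the size check in `TermsAggregationBuilder.size` rejects it. The trace does not show how `LogAbstract.getUserTerms` arrives at that value, so what follows is only a rough sketch under the assumption that the size is derived from a count that came back as 0. `TermsSizeGuard`, `validateSize`, and `safeSize` are hypothetical names for illustration; `validateSize` is a stand-in that mimics the reported message and is not the Elasticsearch implementation.

```java
import java.util.OptionalInt;

public class TermsSizeGuard {

    // Stand-in for the check reported in the trace: size must be greater than 0.
    // Hypothetical; this is not the Elasticsearch code.
    public static int validateSize(String aggName, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException(
                "[size] must be greater than 0. Found [" + size + "] in [" + aggName + "]");
        }
        return size;
    }

    // Hypothetical guard: returns empty when the count is not positive,
    // so the caller can skip the aggregation instead of asking for 0 buckets.
    public static OptionalInt safeSize(int docCount) {
        return docCount > 0 ? OptionalInt.of(docCount) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        int docCount = 0; // assumption: the count behind the failing call was 0
        OptionalInt size = safeSize(docCount);
        if (size.isPresent()) {
            validateSize("Users", size.getAsInt());
            System.out.println("aggregating with size " + size.getAsInt());
        } else {
            System.out.println("no documents, skipping Users aggregation");
        }
    }
}
```

Whether skipping the aggregation is the right behaviour depends on why the count is 0 after an import that reported success, which this sketch does not address.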