[ 
https://issues.apache.org/jira/browse/METRON-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Allen updated METRON-2038:
-------------------------------
    Fix Version/s: 0.7.1

> Enrichment Loader Fails When Run as MR Job
> ------------------------------------------
>
>                 Key: METRON-2038
>                 URL: https://issues.apache.org/jira/browse/METRON-2038
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>            Priority: Major
>             Fix For: 0.7.1
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The enrichment loader fails when run as an MR job on YARN. It runs 
> successfully when run in local mode.
> The following exception occurs inside the YARN container.
> {code}
> 2019-03-13 16:14:28,391 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HBaseConfiguration.createClusterConf(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;Ljava/lang/String;)Lorg/apache/hadoop/conf/Configuration;
>  at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:204)
>  at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76)
>  at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:517)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:501)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1640)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:501)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:287)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1598)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1595)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1526)
> 2019-03-13 16:14:28,394 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1
> {code}
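
A NoSuchMethodError is thrown when the class loaded at run time lacks a method that the calling code was compiled against, so the descriptor in the message names the exact overload that is missing inside the container. That hints at a difference between the HBase classes on the YARN container classpath and those used in local mode, though the report does not confirm this. The sketch below only decodes the JVM method descriptor from the trace into a readable signature. The helper names `decode_descriptor` and `_parse_type` are made up for illustration and are not part of Hadoop, HBase, or Metron.

```python
# JVM primitive type codes used in field and method descriptors.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _parse_type(desc, pos):
    """Parse one type starting at desc[pos]; return (name, next_pos)."""
    dims = 0
    while desc[pos] == "[":          # each '[' adds one array dimension
        dims += 1
        pos += 1
    code = desc[pos]
    if code == "L":                  # object type: L<binary/class/name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:                            # single-letter primitive code
        name = PRIMITIVES[code]
        pos += 1
    return name + "[]" * dims, pos


def decode_descriptor(desc):
    """Return (parameter_types, return_type) for a JVM method descriptor."""
    if not desc.startswith("("):
        raise ValueError("method descriptor must start with '('")
    pos = 1
    params = []
    while desc[pos] != ")":
        name, pos = _parse_type(desc, pos)
        params.append(name)
    ret, _ = _parse_type(desc, pos + 1)
    return params, ret


# Descriptor copied from the NoSuchMethodError in the trace above.
descriptor = ("(Lorg/apache/hadoop/conf/Configuration;"
              "Ljava/lang/String;Ljava/lang/String;)"
              "Lorg/apache/hadoop/conf/Configuration;")
params, ret = decode_descriptor(descriptor)
print(ret, "createClusterConf(" + ", ".join(params) + ")")
```

Read this way, the missing method is a three-argument `createClusterConf` taking a `Configuration` and two `String` values and returning a `Configuration`.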
> Steps to Replicate
> 1. Create a data set of enrichments to load.
> {code}
> [root@node1 0.7.1]# cat alexa.csv
> 1,google.com
> 2,youtube.com
> 3,facebook.com
> 4,baidu.com
> 5,wikipedia.org
> 6,yahoo.com
> 7,google.co.in
> 8,reddit.com
> 9,qq.com
> 10,amazon.com
> 11,taobao.com
> 12,google.co.jp
> 13,twitter.com
> 14,tmall.com
> 15,vk.com
> 16,live.com
> 17,instagram.com
> 18,sohu.com
> 19,sina.com.cn
> 20,weibo.com
> 21,jd.com
> 22,360.cn
> 23,google.de
> 24,google.co.uk
> 25,google.ru
> 26,google.fr
> 27,google.com.br
> 28,list.tmall.com
> 29,linkedin.com
> 30,google.com.hk
> 31,netflix.com
> 32,yandex.ru
> 33,google.it
> 34,yahoo.co.jp
> 35,google.es
> 36,t.co
> 37,pornhub.com
> 38,ebay.com
> 39,imgur.com
> 40,google.com.mx
> 41,google.ca
> 42,alipay.com
> 43,twitch.tv
> 44,xvideos.com
> 45,bing.com
> 46,youth.cn
> 47,msn.com
> 48,aliexpress.com
> 49,tumblr.com
> 50,ok.ru
> {code}
> 2. Push the data to HDFS.
> {code}
> hdfs dfs -put alexa.csv /tmp
> {code}
> 3. Create the enrichment definition.
> {code}
> [root@node1 0.7.1]# cat enrichment.json
> {
>   "zkQuorum":"node1:2181",
>   "sensorToFieldList":{
>     "squid":{
>       "type":"ENRICHMENT",
>       "fieldToEnrichmentTypes":{
>         "domain_without_subdomains":[
>           "whois",
>           "alexa"
>         ]
>       }
>     }
>   }
> }
> {code}
> 4. Create the extractor definition.
> {code}
> [root@node1 0.7.1]# cat extractor.json
> {
>   "config" : {
>     "columns" : {
>       "domain" : 1,
>       "rank" : 0
>     }
>     ,"indicator_column" : "domain"
>     ,"type" : "alexa"
>     ,"separator" : ","
>   },
>   "extractor" : "CSV"
> }
> {code}
> 5. Execute the loader.
> {code}
> /usr/metron/0.7.1/bin/flatfile_loader.sh -n ./enrichment.json -t enrichment -c t -e ./extractor.json -i /tmp/alexa.csv -m MR
> 19/03/13 16:12:26 WARN extractor.TransformFilterExtractorDecorator: Unable to setup zookeeper client - zk_quorum url not provided. **This will limit some Stellar functionality**
> 19/03/13 16:12:26 INFO importer.MapReduceImporter: Configuring MapReduceImporter: /tmp/alexa.csv => enrichment:t
> 19/03/13 16:12:27 INFO client.RMProxy: Connecting to ResourceManager at node1/127.0.0.1:8050
> 19/03/13 16:12:27 INFO client.AHSProxy: Connecting to Application History server at node1/127.0.0.1:10200
>
> 19/03/13 16:14:09 INFO input.FileInputFormat: Total input paths to process : 1
> 19/03/13 16:14:10 INFO mapreduce.JobSubmitter: number of splits:1
> 19/03/13 16:14:11 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552492524533_0003
> 19/03/13 16:14:12 INFO impl.YarnClientImpl: Submitted application application_1552492524533_0003
> 19/03/13 16:14:12 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552492524533_0003/
> 19/03/13 16:14:12 INFO mapreduce.Job: Running job: job_1552492524533_0003
> 19/03/13 16:14:33 INFO mapreduce.Job: Job job_1552492524533_0003 running in uber mode : false
> 19/03/13 16:14:33 INFO mapreduce.Job: map 0% reduce 0%
> 19/03/13 16:14:33 INFO mapreduce.Job: Job job_1552492524533_0003 failed with state FAILED due to: Application application_1552492524533_0003 failed 2 times due to AM Container for appattempt_1552492524533_0003_000002 exited with exitCode: 1
> For more detailed output, check the application tracking page: http://node1:8088/cluster/app/application_1552492524533_0003 Then click on links to logs of each attempt.
> Diagnostics: Exception from container-launch.
> Container id: container_e01_1552492524533_0003_02_000001
> Exit code: 1
> {code}
> 6. The root cause exception is visible in the YARN logs or the application 
> tracker UI.
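
To make the extractor definition in step 4 concrete, here is a minimal sketch of how its `columns`, `indicator_column`, `type`, and `separator` settings could be read against a row of alexa.csv. It is an illustration only, not Metron's CSV extractor: the `extract` function and the shape of its return value are hypothetical, and how Metron actually stores the resulting value in HBase is not shown in this report.

```python
import csv
import io

# The "config" section of extractor.json from step 4.
extractor_config = {
    "columns": {"domain": 1, "rank": 0},
    "indicator_column": "domain",
    "type": "alexa",
    "separator": ",",
}


def extract(line, config):
    """Hypothetical helper: map one CSV line to (type, indicator, fields)."""
    # Split the line using the configured separator.
    row = next(csv.reader(io.StringIO(line), delimiter=config["separator"]))
    # Name each configured column by its zero-based position in the row.
    fields = {name: row[idx] for name, idx in config["columns"].items()}
    # The indicator is the value of the column named by indicator_column.
    indicator = fields[config["indicator_column"]]
    return config["type"], indicator, fields


result = extract("1,google.com", extractor_config)
print(result)
```

Under these assumptions the first row of the data set yields the enrichment type `alexa`, the indicator `google.com`, and a rank of `1`.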



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
