[
https://issues.apache.org/jira/browse/PIG-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153745#comment-15153745
]
Jagdish Kewat commented on PIG-4813:
------------------------------------
Thanks Daniel for responding. Here's the complete log that I could get.
{code}
2016-02-19 05:22:02,945 INFO [main]
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for
application appattempt_1455858325607_0002_000001
2016-02-19 05:22:03,559 WARN [main] org.apache.hadoop.util.NativeCodeLoader:
Unable to load native-hadoop library for your platform... using builtin-java
classes where applicable
2016-02-19 05:22:03,579 INFO [main]
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2016-02-19 05:22:03,579 INFO [main]
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN,
Service: , Ident: (appAttemptId { application_id { id: 2 cluster_timestamp:
1455858325607 } attemptId: 1 } keyId: 1797247994)
2016-02-19 05:22:03,787 INFO [main]
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2016-02-19 05:22:04,842 INFO [main]
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config
org.apache.hadoop.mapred.DirectFileOutputCommitter
2016-02-19 05:22:06,949 INFO [main] com.amazon.ws.emr.hadoop.fs.EmrFileSystem:
Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem
as filesystem implementation
2016-02-19 05:22:07,225 INFO [main] amazon.emr.metrics.MetricsSaver:
MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60
clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072
maxInstanceCount: 500 lastModified: 1455858335246
2016-02-19 05:22:07,226 INFO [main] amazon.emr.metrics.MetricsSaver: Created
MetricsSaver j-LFFAZA3YY2MP:i-64944bbc:MRAppMaster:13647 period:60
/mnt/var/em/raw/i-64944bbc_20160219_MRAppMaster_13647_raw.bin
2016-02-19 05:22:07,881 INFO [main] com.amazonaws.latency: StatusCode=[200],
ServiceName=[Amazon S3], AWSRequestID=[null],
ServiceEndpoint=[https://my-bucket.s3.amazonaws.com],
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0,
HttpClientPoolAvailableCount=0, ClientExecuteTime=[610.782],
HttpRequestTime=[545.302], HttpClientReceiveResponseTime=[43.907],
RequestSigningTime=[14.752], ResponseProcessingTime=[1.363],
HttpClientSendRequestTime=[1.409],
2016-02-19 05:22:08,093 INFO [main]
com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem: listStatus
s3n://my-bucket/user/schema/my-schema.avsc with recursive false
2016-02-19 05:22:08,127 INFO [main] com.amazonaws.latency: StatusCode=[200],
ServiceName=[Amazon S3], AWSRequestID=[null],
ServiceEndpoint=[https://my-bucket.s3.amazonaws.com],
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0,
HttpClientPoolAvailableCount=1, ClientExecuteTime=[31.71],
HttpRequestTime=[30.035], HttpClientReceiveResponseTime=[26.537],
RequestSigningTime=[1.127], ResponseProcessingTime=[0.009],
HttpClientSendRequestTime=[1.029],
2016-02-19 05:22:08,421 INFO [main] com.amazonaws.latency: StatusCode=[200],
ServiceName=[Amazon S3], AWSRequestID=[D89C808B9B1677E2],
ServiceEndpoint=[https://my-bucket.s3-us-west-1.amazonaws.com],
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0,
HttpClientPoolAvailableCount=1, ClientExecuteTime=[292.937],
HttpRequestTime=[277.019], HttpClientReceiveResponseTime=[68.209],
RequestSigningTime=[1.22], ResponseProcessingTime=[13.392],
HttpClientSendRequestTime=[1.032],
2016-02-19 05:22:08,509 INFO [main]
com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem: Opening
's3n://my-bucket/user/schema/my-schema.avsc' for reading
2016-02-19 05:22:08,561 INFO [main] com.amazonaws.latency: StatusCode=[206],
ServiceName=[Amazon S3], AWSRequestID=[1EE49CF8FD820E4A],
ServiceEndpoint=[https://my-bucket.s3.amazonaws.com],
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0,
HttpClientPoolAvailableCount=2, ClientExecuteTime=[48.773],
HttpRequestTime=[42.801], HttpClientReceiveResponseTime=[39.706],
RequestSigningTime=[0.866], ResponseProcessingTime=[1.779],
HttpClientSendRequestTime=[1.112],
2016-02-19 05:22:08,564 INFO [main] amazon.emr.metrics.MetricsSaver: Thread 1
created MetricsLockFreeSaver 1
2016-02-19 05:22:08,841 INFO [main] com.amazon.ws.emr.hadoop.fs.EmrFileSystem:
Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem
as filesystem implementation
2016-02-19 05:22:09,026 INFO [main] com.amazonaws.latency: StatusCode=[200],
ServiceName=[Amazon S3], AWSRequestID=[null],
ServiceEndpoint=[https://my-bucket.s3.amazonaws.com],
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0,
HttpClientPoolAvailableCount=0, ClientExecuteTime=[184.297],
HttpRequestTime=[179.01], HttpClientReceiveResponseTime=[30.483],
RequestSigningTime=[0.578], ResponseProcessingTime=[0.01],
HttpClientSendRequestTime=[0.98],
2016-02-19 05:22:09,036 INFO [main] org.apache.hadoop.service.AbstractService:
Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED;
cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
java.io.IOException: Output schema is null!
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Output schema is null!
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:473)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:453)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1542)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:453)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:371)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1500)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1497)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1430)
Caused by: java.io.IOException: Output schema is null!
at org.apache.pig.piggybank.storage.avro.AvroStorage.getOutputFormat(AvroStorage.java:692)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:92)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:70)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:289)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:471)
... 11 more
{code}
I still need to try the suggested API. I will update here as I make progress.
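For reference, a minimal sketch of the inline-schema route that the issue description reports as working (passing the raw schema string with the 'schema' option). It is only an illustration: the helper name `inline_schema_store`, the relation alias, the output path, and the example schema are all hypothetical, and it assumes the schema text has already been read from the .avsc file on the client side.

```python
import json


def inline_schema_store(alias, output_path, schema_text):
    """Build a Pig STORE statement that passes the Avro schema inline.

    Hypothetical helper: it only assembles a string and does not run Pig.
    """
    schema = json.loads(schema_text)  # fails fast on malformed JSON
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    # Compact to a single line so it fits inside one Pig string literal.
    compact = json.dumps(schema, separators=(",", ":"))
    if "'" in compact:
        # Quoting rules are not handled here; refuse rather than guess.
        raise ValueError("schema contains a single quote; escape it first")
    return (
        f"STORE {alias} INTO '{output_path}' USING "
        f"org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '{compact}');"
    )


# Hypothetical schema standing in for the contents of my-schema.avsc.
EXAMPLE_SCHEMA = """
{
  "type": "record",
  "name": "Example",
  "fields": [{"name": "id", "type": "long"}]
}
"""

if __name__ == "__main__":
    # The alias and output location below are placeholders.
    print(inline_schema_store("records", "s3n://my-bucket/output", EXAMPLE_SCHEMA))
```

Generating the statement this way keeps the schema in a single file while avoiding the external-file lookup that fails in the log above.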
> AvroStorage doesn't work for schema from external file for EMR
> --------------------------------------------------------------
>
> Key: PIG-4813
> URL: https://issues.apache.org/jira/browse/PIG-4813
> Project: Pig
> Issue Type: Bug
> Reporter: Jagdish Kewat
>
> Hi Team,
> I couldn't get the schema loading for AvroStorage as described in
> http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-etl-avro.html
> working.
> It works fine if I provide the raw schema string with the 'schema' option, as
> described in https://cwiki.apache.org/confluence/display/PIG/AvroStorage.
> On HDFS I don't even need to specify the schema with the store command.
> Some quick details on the versions:
> * Hadoop :
> {code}
> Hadoop 2.6.0-amzn-2
> Subversion [email protected]:/pkg/Aws157BigTop -r
> 41f4e6be3ac5d6676a3464f77de79a33e8fdd9f3
> Compiled by ec2-user on 2015-11-16T20:56Z
> Compiled with protoc 2.5.0
> {code}
> * Pig :
> {code}
> Apache Pig version 0.14.0-amzn-0 (r: unknown)
> {code}
> * piggybank jar version:
> ** piggybank-0.14.0.jar
> * avro jar version :
> ** avro-1.7.7.jar
> * avro-ipc jar version :
> ** avro-ipc-1.7.7.jar
> * json-simple jar version :
> ** json-simple-1.1.jar
> I tried looking for an EMR-specific version of the piggybank jar, but had no luck. I
> fear I am not using the correct versions of the jars, since the feature should work as
> documented.
> Please advise if I am missing anything.
> Thanks,
> Jagdish
>