[ 
https://issues.apache.org/jira/browse/PIG-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153745#comment-15153745
 ] 

Jagdish Kewat commented on PIG-4813:
------------------------------------

Thanks, Daniel, for responding. Here's the complete log I could get.

{code}
2016-02-19 05:22:02,945 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for 
application appattempt_1455858325607_0002_000001
2016-02-19 05:22:03,559 WARN [main] org.apache.hadoop.util.NativeCodeLoader: 
Unable to load native-hadoop library for your platform... using builtin-java 
classes where applicable
2016-02-19 05:22:03,579 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2016-02-19 05:22:03,579 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, 
Service: , Ident: (appAttemptId { application_id { id: 2 cluster_timestamp: 
1455858325607 } attemptId: 1 } keyId: 1797247994)
2016-02-19 05:22:03,787 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2016-02-19 05:22:04,842 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config 
org.apache.hadoop.mapred.DirectFileOutputCommitter
2016-02-19 05:22:06,949 INFO [main] com.amazon.ws.emr.hadoop.fs.EmrFileSystem: 
Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem 
as filesystem implementation
2016-02-19 05:22:07,225 INFO [main] amazon.emr.metrics.MetricsSaver: 
MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 
clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 
maxInstanceCount: 500 lastModified: 1455858335246 
2016-02-19 05:22:07,226 INFO [main] amazon.emr.metrics.MetricsSaver: Created 
MetricsSaver j-LFFAZA3YY2MP:i-64944bbc:MRAppMaster:13647 period:60 
/mnt/var/em/raw/i-64944bbc_20160219_MRAppMaster_13647_raw.bin
2016-02-19 05:22:07,881 INFO [main] com.amazonaws.latency: StatusCode=[200], 
ServiceName=[Amazon S3], AWSRequestID=[null], 
ServiceEndpoint=[https://my-bucket.s3.amazonaws.com], 
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, 
HttpClientPoolAvailableCount=0, ClientExecuteTime=[610.782], 
HttpRequestTime=[545.302], HttpClientReceiveResponseTime=[43.907], 
RequestSigningTime=[14.752], ResponseProcessingTime=[1.363], 
HttpClientSendRequestTime=[1.409], 
2016-02-19 05:22:08,093 INFO [main] 
com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem: listStatus 
s3n://my-bucket/user/schema/my-schema.avsc with recursive false
2016-02-19 05:22:08,127 INFO [main] com.amazonaws.latency: StatusCode=[200], 
ServiceName=[Amazon S3], AWSRequestID=[null], 
ServiceEndpoint=[https://my-bucket.s3.amazonaws.com], 
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, 
HttpClientPoolAvailableCount=1, ClientExecuteTime=[31.71], 
HttpRequestTime=[30.035], HttpClientReceiveResponseTime=[26.537], 
RequestSigningTime=[1.127], ResponseProcessingTime=[0.009], 
HttpClientSendRequestTime=[1.029], 
2016-02-19 05:22:08,421 INFO [main] com.amazonaws.latency: StatusCode=[200], 
ServiceName=[Amazon S3], AWSRequestID=[D89C808B9B1677E2], 
ServiceEndpoint=[https://my-bucket.s3-us-west-1.amazonaws.com], 
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, 
HttpClientPoolAvailableCount=1, ClientExecuteTime=[292.937], 
HttpRequestTime=[277.019], HttpClientReceiveResponseTime=[68.209], 
RequestSigningTime=[1.22], ResponseProcessingTime=[13.392], 
HttpClientSendRequestTime=[1.032], 
2016-02-19 05:22:08,509 INFO [main] 
com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem: Opening 
's3n://my-bucket/user/schema/my-schema.avsc' for reading
2016-02-19 05:22:08,561 INFO [main] com.amazonaws.latency: StatusCode=[206], 
ServiceName=[Amazon S3], AWSRequestID=[1EE49CF8FD820E4A], 
ServiceEndpoint=[https://my-bucket.s3.amazonaws.com], 
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, 
HttpClientPoolAvailableCount=2, ClientExecuteTime=[48.773], 
HttpRequestTime=[42.801], HttpClientReceiveResponseTime=[39.706], 
RequestSigningTime=[0.866], ResponseProcessingTime=[1.779], 
HttpClientSendRequestTime=[1.112], 
2016-02-19 05:22:08,564 INFO [main] amazon.emr.metrics.MetricsSaver: Thread 1 
created MetricsLockFreeSaver 1
2016-02-19 05:22:08,841 INFO [main] com.amazon.ws.emr.hadoop.fs.EmrFileSystem: 
Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem 
as filesystem implementation
2016-02-19 05:22:09,026 INFO [main] com.amazonaws.latency: StatusCode=[200], 
ServiceName=[Amazon S3], AWSRequestID=[null], 
ServiceEndpoint=[https://my-bucket.s3.amazonaws.com], 
HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, 
HttpClientPoolAvailableCount=0, ClientExecuteTime=[184.297], 
HttpRequestTime=[179.01], HttpClientReceiveResponseTime=[30.483], 
RequestSigningTime=[0.578], ResponseProcessingTime=[0.01], 
HttpClientSendRequestTime=[0.98], 
2016-02-19 05:22:09,036 INFO [main] org.apache.hadoop.service.AbstractService: 
Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; 
cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
java.io.IOException: Output schema is null!
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: 
Output schema is null!
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:473)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:453)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1542)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:453)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:371)
        at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1500)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1497)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1430)
Caused by: java.io.IOException: Output schema is null!
        at 
org.apache.pig.piggybank.storage.avro.AvroStorage.getOutputFormat(AvroStorage.java:692)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:92)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:70)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:289)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:471)
        ... 11 more
{code}

I need to check with the suggested API and will update the ticket as I make progress.
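For reference, the two forms involved are sketched below. The bucket and paths are hypothetical placeholders, and the option names are taken from the AvroStorage wiki page linked in the issue; this is an illustration of the reported behavior, not a confirmed fix.

{code}
-- Inline schema via the 'schema' option: reported to work on EMR
STORE data INTO 's3n://my-bucket/output'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage(
        'schema', '{"type":"record","name":"MyRecord","fields":[{"name":"f1","type":"string"}]}');

-- Schema from an external file via 'schema_file': fails on EMR with
-- "java.io.IOException: Output schema is null!" per the stack trace above
STORE data INTO 's3n://my-bucket/output'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage(
        'schema_file', 's3n://my-bucket/user/schema/my-schema.avsc');
{code}

The stack trace shows the failure originating in AvroStorage.getOutputFormat (AvroStorage.java:692) when the output committer is constructed, i.e. the schema from the external file never reaches the output side of the store.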

> AvroStorage doesn't work for schema from external file for EMR
> --------------------------------------------------------------
>
>                 Key: PIG-4813
>                 URL: https://issues.apache.org/jira/browse/PIG-4813
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Jagdish Kewat
>
> Hi Team,
> I couldn't get schema loading for AvroStorage, as described in 
> http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-etl-avro.html,
>  to work. 
> It works fine if I provide the raw schema string with the 'schema' option, as 
> described in https://cwiki.apache.org/confluence/display/PIG/AvroStorage.
> On HDFS I don't even need to specify the schema with the store command.
> Some quick insights regarding the versions:
> * Hadoop :
> {code}
> Hadoop 2.6.0-amzn-2
> Subversion [email protected]:/pkg/Aws157BigTop -r 
> 41f4e6be3ac5d6676a3464f77de79a33e8fdd9f3
> Compiled by ec2-user on 2015-11-16T20:56Z
> Compiled with protoc 2.5.0
> {code}
> * Pig :
> {code}
> Apache Pig version 0.14.0-amzn-0 (r: unknown)
> {code}
> * piggybank jar version:
> ** piggybank-0.14.0.jar
> * avro jar version :
> ** avro-1.7.7.jar
> * avro-ipc jar version :
> ** avro-ipc-1.7.7.jar
> * json-simple jar version
> ** json-simple-1.1.jar
> I tried looking for an EMR-specific piggybank jar, but had no luck. I 
> fear I am not using the correct versions of the jars, since the feature 
> should work as documented. 
> Please advise if I am missing anything.
> Thanks,
> Jagdish
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
