I used Sqoop to import an MS SQL Server table into an Avro data file on HDFS.
No problem. For reference, the import command looked roughly like this (the
host, database, and table names here are placeholders, not the real ones):
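# placeholder connection details; --as-avrodatafile writes Avro output
sqoop import \
  --connect 'jdbc:sqlserver://dbhost:1433;databaseName=MyDatabase' \
  --username myuser -P \
  --table MyTable \
  --as-avrodatafile \
  --target-dir /tmp/AvroTable

Then I tried to create an external Impala table using the following DDL: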
CREATE EXTERNAL TABLE AvroTable
STORED AS AVRO
LOCATION '/tmp/AvroTable';
I got the error "ERROR: AnalysisException: Error loading Avro schema: No Avro
schema provided in SERDEPROPERTIES or TBLPROPERTIES for table:
default.AvroTable"
So I extracted the schema from the Avro file into a JSON file using
avro-tools-1.7.4.jar and its getschema tool. The steps were roughly these
(the part-file name is just an example):
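# pull one of Sqoop's output files down, dump its embedded schema,
# and push the schema file back up to HDFS
hadoop fs -get /tmp/AvroTable/part-m-00000.avro .
java -jar avro-tools-1.7.4.jar getschema part-m-00000.avro > AvroTable.schema
hadoop fs -put AvroTable.schema /tmp/AvroTable.schema

Then, per the recommendation above, I changed the DDL to point to it: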
CREATE EXTERNAL TABLE AvroTable
STORED AS AVRO
LOCATION '/tmp/AvroTable'
TBLPROPERTIES(
'serialization.format'='1',
'avro.schema.url'='hdfs://xxxxxxxx.xxxxxxxx.xxxxxxxx.net/tmp/AvroTable.schema'
);
This worked fine. But my question is: why do you have to do this at all? The
schema is already embedded in the Avro file - that's where I got the JSON
schema that I point to via avro.schema.url in TBLPROPERTIES!
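(For what it's worth, the embedded schema is plainly visible in the file
itself; for example, avro-tools' getmeta tool prints the file's avro.schema
metadata key:
java -jar avro-tools-1.7.4.jar getmeta part-m-00000.avro
so the schema is clearly sitting right there in the data file.)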
Thanks, Tom
Tom Vitale
CREDIT SUISSE
Information Technology | Infra Arch & Strategy NY, KIVP
Eleven Madison Avenue | 10010-3629 New York | United States
Phone +1 212 538 0708
[email protected] | www.credit-suisse.com