Based on our experience, it's better not to use Sqoop to create Parquet
files.
Even if you manage to produce a Parquet file, you will run into serious
data type problems when working with the Hive metastore.
I recommend Spark SQL for creating Parquet files.
It works flawlessly.
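
For example, here is a minimal PySpark sketch of what I mean (the JDBC
URL, credentials, table name, and gs:// path below are placeholders I
made up, and the Postgres JDBC driver needs to be on the Spark
classpath):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("postgres-to-parquet").getOrCreate()

# Read the source table from Postgres over JDBC
df = (spark.read.format("jdbc")
      .option("url", "jdbc:postgresql://db-host:5432/mydb")
      .option("dbtable", "public.my_table")
      .option("user", "my_user")
      .option("password", "my_password")
      .load())

# Let Spark write the Parquet files (and their schema) straight to GCS
df.write.mode("overwrite").parquet("gs://my-bucket/my_table_parquet/")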


On Mon, Feb 25, 2019 at 12:54 PM Markus Kemper <mar...@cloudera.com> wrote:

> To the best of my knowledge the only way to use Sqoop export with Parquet
> is via the --hcat options, sample below
>
> sqoop export --connect $MYSQL_CONN --username $MYSQL_USER \
>   --password $MYSQL_PSWD --table t2 --num-mappers 1 \
>   --hcatalog-database default --hcatalog-table t1_parquet_table
>
>
> Markus Kemper
> Cloudera Support
>
>
>
>
> On Mon, Feb 25, 2019 at 12:36 PM Preethi Krishnan <pkrish...@pandora.com>
> wrote:
>
>>
>>
>> Hi,
>>
>>
>>
>> I’m using the Sqoop Hadoop jar to sqoop the data from Postgres to Google
>> Cloud Storage (GCS). It works fine for text format, but I’m unable to
>> load it in Parquet format. It does not fail, but it does not load the
>> data either. The jar file I’m using is sqoop-1.4.7-hadoop260.jar.
>>
>>
>>
>> Is there a specific way I should be loading the data in Parquet format
>> using Sqoop?
>>
>> Thanks
>>
>> Preethi
>>
>
