Hi,

I think I have spotted some minor errors in the Sqoop documentation:

[1]: the table says direct mode is enabled for PostgreSQL only for import. That's wrong: it is enabled for export too.
[2]: psql is not needed for either import or export.
----------

Now my questions:

I have been able to load data from all formats (including ORC) into PostgreSQL with sqoop export in **non-direct** mode. While robust, it uses JDBC prepared INSERT statements and is far too slow, even when parallelized.

I have been able to load data from **CSV only** with sqoop export in **direct mode**. While very fast (parallel COPY statements!), the method is not robust when the data has varchar columns. In particular, a varchar column may contain **newlines**, and this breaks the mapper job, since it splits the CSV by newlines. That's too bad, because the *COPY* statement can handle *CSV with embedded newlines*.

1) Is there any way to send a whole HDFS file to each mapper instead of splitting it? That would work well.
2) Is there any plan to allow sqoop export from ORC in direct mode?

Thanks,

[1]: https://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html#_supported_databases
[2]: https://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html#_requirements_2

--
nicolas
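
P.S. To illustrate the newline issue, here is a minimal Python sketch. It is not Sqoop code, and the sample data is made up. It only shows how a line-based split cuts a quoted varchar value in two, while a CSV-aware parser keeps it as one record:

```python
import csv
import io

# Hypothetical sample: two rows, where the varchar value in row 1
# contains a newline inside a quoted field.
data = '1,"first line\nsecond line"\n2,"plain"\n'

# Naive line-based splitting treats every newline as a record boundary,
# so the first row is cut into two broken pieces.
naive_records = data.splitlines()

# A CSV-aware parser honours the quotes and keeps the embedded newline
# inside a single field.
parsed_records = list(csv.reader(io.StringIO(data)))

print(len(naive_records))    # 3 "records", two of them malformed
print(len(parsed_records))   # 2 records, as intended
print(repr(parsed_records[0][1]))
```

This is why handing a whole file to a single reader that understands CSV quoting would avoid the problem, whereas splitting by newline first cannot.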