I'm having trouble doing a Sqoop incremental import into HDFS. I've tried several approaches and get failures each way. First attempt:
```
sqoop import \
  --connect 'jdbc:postgresql://www.example/dbname' \
  --username blah \
  --query 'SELECT guidval::varchar(36), column2, datefield2, lastmodifieddate from foo WHERE $CONDITIONS' \
  --target-dir /hdfs/path \
  --split-by datefield2 \
  --check-column lastmodifieddate \
  --last-value '2011-11-03' \
  --incremental lastmodified
```

The error I get is "Incremental requires an import", so I'm assuming I'm doing the incremental import incorrectly and that it won't work with a free-form query. So I then tried this:

```
sqoop import \
  --connect 'jdbc:postgresql://www.example/dbname' \
  --username blah \
  --query 'SELECT guidval::varchar(36), column2, datefield2, lastmodifieddate from foo WHERE $CONDITIONS or (lastmodifieddate > '2011-11-03')' \
  --target-dir /hdfs/path \
  --split-by datefield2
```

But that fails with "ERROR: cannot cast type integer to timestamp without time zone", and explicitly casting the value to a timestamp doesn't help either. So I tried moving the filter into --where:

```
sqoop import \
  --connect 'jdbc:postgresql://www.example/dbname' \
  --username blah \
  --query 'SELECT guidval::varchar(36), column2, datefield2, lastmodifieddate from foo' \
  --where "lastmodifieddate > '2011-11-03'" \
  --target-dir /hdfs/path \
  --split-by datefield2
```

But that errors out demanding a $CONDITIONS clause in the query, which doesn't work either.

I can't use the --table option because one of the columns is a UUID, which Sqoop apparently doesn't support yet; the import kept failing until I switched to a full SELECT that casts the UUID column to varchar(36). Also note that the timestamp isn't necessarily the current timestamp; it can be any timestamp passed in as a parameter.

Thanks
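Edit: for reference, this is the incremental form from the Sqoop docs that I'm trying to reproduce with a free-form query. A minimal sketch, assuming a plain table with no UUID column (`sometable` is a placeholder); I assume this form would work for me if not for the UUID issue:

```
sqoop import \
  --connect 'jdbc:postgresql://www.example/dbname' \
  --username blah \
  --table sometable \
  --target-dir /hdfs/path \
  --check-column lastmodifieddate \
  --incremental lastmodified \
  --last-value '2011-11-03'
```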
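Edit 2: I suspect the nested single quotes in my second attempt are part of the problem: the shell would end the string at the inner quote, so Postgres might be evaluating `2011-11-03` as integer arithmetic, which could explain the cast error. My best guess at the quoting, with the whole query in double quotes and `$CONDITIONS` escaped so the shell leaves it alone (an unverified sketch, not something I've gotten working):

```
sqoop import \
  --connect 'jdbc:postgresql://www.example/dbname' \
  --username blah \
  --query "SELECT guidval::varchar(36), column2, datefield2, lastmodifieddate from foo WHERE \$CONDITIONS AND lastmodifieddate > '2011-11-03'" \
  --target-dir /hdfs/path \
  --split-by datefield2
```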