Hi Sneh, I have tried the same scenario with S3 and got the following error:
17/02/10 04:09:04 ERROR tool.ImportTool: Imported Failed: Wrong FS: s3a://..., expected: hdfs://... I guess this is the same you have seen on your side. I have debugged the code and this error comes from the org.apache.sqoop.tool.ImportTool#initIncrementalConstraints method but I can't see a straightforward solution to it. Hadoop itself supports S3 and that is why some scenarios with S3 work but Sqoop does not have a full S3 support. If you think you can create a new feature request a JIRA I think the community will work more on S3 related features in the future. Regards, Szabolcs On Mon, Feb 6, 2017 at 9:52 PM, Anna Szonyi <[email protected]> wrote: > Hi Sneh, > > Currently the exclude tables is implemented with contains on the array of > tables: > for (String tableName : tables) { > > if (excludes.contains(tableName)) { > > System.out.println("Skipping table: " + tableName); > > } > ... > > So it currently doesn't work, however adding support for some sort of > wildcard wouldn't be too difficult. > If this is something you need, it might make sense to create a jira > <https://issues.apache.org/jira/browse/SQOOP/> for it, with your usecase. > > Thanks, > Anna > > > > On Sun, Feb 5, 2017 at 8:34 PM, Sneh <[email protected]> > wrote: > >> Hi Liz, >> >> I tried running the following command (create a job and then exec) to >> incremental fetch data to S3 (on AWS EMR cluster with EMRFS consistent >> view). >> sqoop job --create incre_reservation -- import --connect >> "jdbc:postgresql://rds-replica-hmssync.XXX.rds.amazonaws.com/hms" >> --username XXX --password XXX --table reservationbooking --incremental >> lastmodified --check-column modified_at --target-dir >> "s3://platform-poc/sqoop/reservation/incre" >> >> The error which I get says that FS should be HDFS and not S3. >> I came up with *alternate* approach to "delta fetch" the data to HDFS >> and then run merge command. >> >> I wanted to check if the "hop" to HDFS can be saved and direct merge >> could happen at S3. >> >> I got an another question, unrelated to the above: >> -> Is there a way I can use wildcards to exclude tables (without >> specifying the exact table names) while importing all the tables? >> >> Thanks for your time! >> >> >> Wishes, >> Sneh >> 8884383482 <(888)%20438-3482> >> >> On Fri, Feb 3, 2017 at 5:24 PM, Erzsebet Szilagyi < >> [email protected]> wrote: >> >>> Hi Sneh, >>> Could you give us a sample command that you are trying to run? >>> Thanks, >>> Liz >>> >>> On Thu, Jan 19, 2017 at 1:36 PM, Sneh <[email protected]> >>> wrote: >>> >>>> Dear Sqoop users, >>>> >>>> I've spawned an EMR cluster with Sqoop 1.4.6 and trying to "increment >>>> fetch" data from RDS to S3. >>>> The error I get is that FS should be HDFS and not S3. >>>> >>>> My EMR cluster is enabled for EMRFS consistent view. >>>> I am trying to build a pipeline from RDS to S3. Need help in direction >>>> to how to proceed when increment Sqoop job is unable to write to S3. >>>> >>>> Please help! >>>> >>>> >>>> Wishes, >>>> Sneh >>>> 8884383482 <(888)%20438-3482> >>>> >>>> >>>> <https://s3-ap-southeast-1.amazonaws.com/treebo-email/Great+Rates/sign.jpg> >>> >>> >>> >>> >>> -- >>> Erzsebet Szilagyi >>> Software Engineer >>> [image: www.cloudera.com] <http://www.cloudera.com> >>> >> >> >> >> <https://s3-ap-southeast-1.amazonaws.com/treebo-email/Great+Rates/sign.jpg> > > > -- Szabolcs Vasas Software Engineer <http://www.cloudera.com>
