Hi,

I am evaluating Sqoop for database extracts from our relational stores. Our production cluster runs Hadoop 0.20.append. According to the Sqoop introduction page on GitHub:

"Sqoop relies on advanced features of Apache Hadoop. As such, it requires the latest beta of Cloudera’s Distribution for Hadoop (CDH3 beta 2). Sqoop may be compatible with the Apache 0.21.0 release, but this is considered experimental and should not be used in production. The COMPILING.txt file describes how to select a Hadoop distribution to target at compilation time."

Does this still hold true? All I want to do is incrementally import tables from an Oracle database. Can someone explain which features are missing from the non-Cloudera distributions, and why it is unsafe to use them in production?

Thank you.