Good day,

My table has on the order of a billion rows and does not have an integer column as its key, which means that if I use Sqoop to import it into Hive, I will not be able to use multiple mappers. Because the table is so large, adding a new integer column to it is not realistic.
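To make the constraint concrete, here is a minimal sketch of range-based splitting over a numeric key, which is roughly the idea behind giving each mapper its own slice of a table. It is a simplified illustration in Python, not Sqoop's actual code; the function name `split_ranges` and the column name `id` are made up for the example.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the inclusive key range [lo, hi] into contiguous,
    non-overlapping ranges, one per mapper (hypothetical helper)."""
    total = hi - lo + 1
    base, extra = divmod(total, num_mappers)
    ranges = []
    start = lo
    for i in range(num_mappers):
        # The first `extra` ranges take one additional key each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more mappers than keys: skip empty ranges
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    # Assumed values: keys 1..1,000,000,000 and 4 mappers.
    for start, end in split_ranges(1, 1_000_000_000, 4):
        print(f"WHERE id >= {start} AND id <= {end}")
```

With a numeric key the ranges are easy to compute and do not overlap. Without one, I have no obvious column to slice on like this.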
I did come across a post on the Hortonworks community site that seems to suggest it is possible to split on a string column, but the comments on it raised two concerns:

1. There is no guarantee that Sqoop splits the records evenly across the mappers.
2. For a huge number of rows, the suggested options will cause duplicates in the result set.

https://community.hortonworks.com/questions/26961/sqoop-split-by-on-a-string-varchar-column.html

Any thoughts? Thank you very much.

Sincerely yours,
Raymond