Try this.
This is for Oracle but should work for MSSQL. If you want ordering, do it
on the DataFrame:
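import org.apache.spark.sql.functions.asc

// Assumes HiveContext is an existing instance, e.g.
// val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
val d = HiveContext.load("jdbc",
  Map("url" -> _ORACLEserver,
      // Triple quotes let the subquery span several lines
      "dbtable" -> """(SELECT to_char(ID) AS ID, to_char(CLUSTERED) AS CLUSTERED,
        to_char(SCATTERED) AS SCATTERED, to_char(RANDOMISED) AS RANDOMISED,
        RANDOM_STRING, SMALL_VC, PADDING FROM scratchpad.dummy)""",
      "user" -> _username,
      "password" -> _password))

// Sort explicitly on the DataFrame, then register the result
d.sort(asc("ID")).registerTempTable("tmp")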
I believe that will work.
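One caveat, offered as a rough sketch rather than anything definitive: because the query wraps ID in to_char, the column arrives as a string, and sorting strings is lexicographic rather than numeric. The plain Scala snippet below (runnable in the spark-shell or Scala REPL) shows the difference on a few made-up sample values; the names ids, asStrings and asNumbers are only for illustration.

```scala
// Hypothetical sample values standing in for the ID column after to_char(ID).
val ids = Seq("2", "10", "1")

// String ordering compares character by character, so "10" sorts before "2".
val asStrings = ids.sorted

// Converting to a number first gives the numeric order.
val asNumbers = ids.sortBy(_.toLong)

println(asStrings) // List(1, 10, 2)
println(asNumbers) // List(1, 2, 10)
```

If numeric order is what you need, one option is to cast before sorting on the DataFrame, for example d.sort(col("ID").cast("long").asc) with col imported from org.apache.spark.sql.functions, assuming the IDs fit in a long.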
HTH
Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
On 20 June 2016 at 12:10, Takeshi Yamamuro <[email protected]> wrote:
> Hi,
>
> Currently, no.
> Spark cannot preserve the order of input data from JDBC.
> If you want to have the ordered ids, you need to sort them explicitly.
>
> // maropu
>
> On Mon, Jun 20, 2016 at 7:41 PM, Ashok Kumar <[email protected]> wrote:
>
>> Hi,
>>
>> I have a SQL Server table with 500,000,000 rows and a primary key (unique
>> clustered index) on the ID column.
>>
>> If I load it through JDBC into a DataFrame and register it via
>> registerTempTable, will the data be in the order of ID in the temp table?
>>
>> Thanks
>>
>
>
>
> --
> ---
> Takeshi Yamamuro
>