Try: REGEXP_REPLACE(net,'[^\\d\\.]',''); Thank you, Kind Regards ~Maciek
On 15 January 2016 at 17:33, Mich Talebzadeh <m...@peridale.co.uk> wrote: > OK this is a convoluted way of doing it using SUBSTR and REGEXP_REPLACE > UDF to get rid of ‘?’ and commas in the curremcy imported from CSV file > > > > > > > > 0: jdbc:hive2://rhes564:10010/default> select net, > cast(REGEXP_REPLACE(SUBSTR(net,2,20),",","") AS DECIMAL(10,2)) AS > net_in_currency from t2; > > +--------------+------------------+--+ > > | net | net_in_currency | > > +--------------+------------------+--+ > > | ?182,531.25 | 182531.25 | > > | | NULL | > > | ?113,500.00 | 113500 | > > | ?69,031.25 | 69031.25 | > > > > But I am sure there are better ways. Well I know one can use sed and awk > to do the same in the original csv file at the OS level before creating > the external table but that beats the ELT spirit! > > > > > > Dr Mich Talebzadeh > > > > LinkedIn * > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw > <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* > > > > *Sybase ASE 15 Gold Medal Award 2008* > > A Winning Strategy: Running the most Critical Financial Data on ASE 15 > > > http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf > > Author of the books* "A Practitioner’s Guide to Upgrading to Sybase ASE > 15", ISBN 978-0-9563693-0-7*. > > co-author *"Sybase Transact SQL Guidelines Best Practices", ISBN > 978-0-9759693-0-4* > > *Publications due shortly:* > > *Complex Event Processing in Heterogeneous Environments*, ISBN: > 978-0-9563693-3-8 > > *Oracle and Sybase, Concepts and Contrasts*, ISBN: 978-0-9563693-1-4, volume > one out shortly > > > > http://talebzadehmich.wordpress.com > > > > NOTE: The information in this email is proprietary and confidential. This > message is for the designated recipient only, if you are not the intended > recipient, you should destroy it immediately. Any information in this > message shall not be understood as given or endorsed by Peridale Technology > Ltd, its subsidiaries or their employees, unless expressly so stated. It is > the responsibility of the recipient to ensure that this email is virus > free, therefore neither Peridale Technology Ltd, its subsidiaries nor their > employees accept any responsibility. > > > > *From:* Mich Talebzadeh [mailto:m...@peridale.co.uk] > *Sent:* 15 January 2016 16:08 > *To:* user@hive.apache.org > *Subject:* RE: Importing currency figures with currency indicator to Hive > from CSV > > > > Hi, > > > > What is the equivalent of Oracle TO_NUMBER function in Hive please? > > > > Thanks > > > > Dr Mich Talebzadeh > > > > LinkedIn * > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw > <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* > > > > *Sybase ASE 15 Gold Medal Award 2008* > > A Winning Strategy: Running the most Critical Financial Data on ASE 15 > > > http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf > > Author of the books* "A Practitioner’s Guide to Upgrading to Sybase ASE > 15", ISBN 978-0-9563693-0-7*. > > co-author *"Sybase Transact SQL Guidelines Best Practices", ISBN > 978-0-9759693-0-4* > > *Publications due shortly:* > > *Complex Event Processing in Heterogeneous Environments*, ISBN: > 978-0-9563693-3-8 > > *Oracle and Sybase, Concepts and Contrasts*, ISBN: 978-0-9563693-1-4, volume > one out shortly > > > > http://talebzadehmich.wordpress.com > > > > NOTE: The information in this email is proprietary and confidential. This > message is for the designated recipient only, if you are not the intended > recipient, you should destroy it immediately. Any information in this > message shall not be understood as given or endorsed by Peridale Technology > Ltd, its subsidiaries or their employees, unless expressly so stated. It is > the responsibility of the recipient to ensure that this email is virus > free, therefore neither Peridale Technology Ltd, its subsidiaries nor their > employees accept any responsibility. > > > > *From:* Mich Talebzadeh [mailto:m...@peridale.co.uk <m...@peridale.co.uk>] > > *Sent:* 15 January 2016 15:11 > *To:* user@hive.apache.org > *Subject:* Importing currency figures with currency indicator to Hive > from CSV > > > > Hi, > > > > > > How to do convert a column stored as String in Hive into Decimal if > possible. > > > > The excel looks like this > > > > *Invoice Number* > > *Payment date* > > *Net* > > *VAT* > > *Total* > > 360 > > *10/02/2014* > > *£10,000.00* > > £2000.00 > > £12,000.00 > > > > And the file (before bzip2) looks like this on imported directory in Linux > > Invoice Number,Payment date,Net,VAT,Total > > 360,10/02/2014,*"▒12,000.00"**,▒*2000.00,"▒12,000.00" > > > > Once the data inserted from the external table into Hive (as string), > looks like below > > > +-------------------+-----------------+--------------+-------------+--------------+--+ > > | t2.invoicenumber | t2.paymentdate | t2.net | t2.vat | > t2.total | > > > +-------------------+-----------------+--------------+-------------+--------------+--+ > > | 360 | 10/02/2014 | *?10,000.00* | ?2000.00 | > ?12,000.00 | > > | > > Is there any way to resolve this column to decimal format? > > > > Thanks > > > > > > Dr Mich Talebzadeh > > > > LinkedIn * > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw > <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* > > > > *Sybase ASE 15 Gold Medal Award 2008* > > A Winning Strategy: Running the most Critical Financial Data on ASE 15 > > > http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf > > Author of the books* "A Practitioner’s Guide to Upgrading to Sybase ASE > 15", ISBN 978-0-9563693-0-7*. > > co-author *"Sybase Transact SQL Guidelines Best Practices", ISBN > 978-0-9759693-0-4* > > *Publications due shortly:* > > *Complex Event Processing in Heterogeneous Environments*, ISBN: > 978-0-9563693-3-8 > > *Oracle and Sybase, Concepts and Contrasts*, ISBN: 978-0-9563693-1-4, volume > one out shortly > > > > http://talebzadehmich.wordpress.com > > > > NOTE: The information in this email is proprietary and confidential. This > message is for the designated recipient only, if you are not the intended > recipient, you should destroy it immediately. Any information in this > message shall not be understood as given or endorsed by Peridale Technology > Ltd, its subsidiaries or their employees, unless expressly so stated. It is > the responsibility of the recipient to ensure that this email is virus > free, therefore neither Peridale Technology Ltd, its subsidiaries nor their > employees accept any responsibility. > > >