The Drill docs mention that the REGEXP_REPLACE function uses Java regex syntax, as described here:
http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html


Link to REGEXP_REPLACE doc - 
https://drill.apache.org/docs/string-manipulation/#regexp_replace
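
Since the syntax is that of java.util.regex.Pattern, one way to sanity-check a pattern outside Drill is a small Java program. The following is only a sketch: the class name, the helper method, and the sample value are made up for illustration, and it says nothing about how Drill reads the CSV file. It compares the character class from the original query with the alternation suggested further down the thread.

```java
import java.util.regex.Pattern;

public class RegexpReplaceCheck {
    // Character class from the original query: comma, double quote, period.
    // Inside [...] the period is a literal, so it needs no escaping.
    public static final Pattern CHAR_CLASS = Pattern.compile("[,\".]");

    // Alternation suggested in the thread: '[' or ',' or '.' or ']'.
    // In Java source every regex backslash has to be doubled.
    public static final Pattern ALTERNATION = Pattern.compile("\\[|,|\\.|]");

    // Hypothetical helper: remove every match of the pattern from the input.
    public static String stripWith(Pattern pattern, String input) {
        return pattern.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Made-up sample value, shaped like a formatted "columns" array.
        String sample = "[\"a,b\",\"c.d\"]";
        System.out.println(stripWith(CHAR_CLASS, sample));   // prints [abcd]
        System.out.println(stripWith(ALTERNATION, sample));  // prints "ab""cd"
    }
}
```

Both patterns compile and match as plain Java regexes, so if the Drill query still returns an empty result set, the cause is more likely to be in how the file is split into columns than in the regex itself.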


________________________________
From: Paul Rogers <[email protected]>
Sent: Thursday, July 27, 2017 11:38:42 AM
To: [email protected]
Subject: Re: regex replace in string

DRILL-4645: “Regex_replace() function is broken”?

I still wonder: the characters you are looking for are the ones used to format a 
“columns” column on output. Can you show a couple of lines of your CSV file?

Or, take a look at SELECT * FROM … LIMIT 10 to see if the data is being split 
into columns, or is somehow using the “columns” column…

That aside, the docs don’t state the regex syntax used in the REGEXP_REPLACE 
function. Is it Java regex? If so, then you’d have to format it according to 
the Java syntax ([1]). So:

'\[|,|\.|]'

Note that “[” and “.” are part of the regex syntax, so must be escaped.

- Paul

[1] https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

> On Jul 26, 2017, at 10:49 PM, Divya Gehlot <[email protected]> wrote:
>
> Hi,
> I have already set the plugin configuration to extractHeader: true,
> and I followed the link below:
> https://drill.apache.org/docs/lesson-2-run-queries-with-ansi-sql/
>
> SELECT REGEXP_REPLACE(CAST(`Column1` AS VARCHAR(100)), '[,".]', '') AS
> `Col1` FROM
> dfs.`installedsoftwares/ApacheDrill/apache-drill-1.10.0.tar/apache-drill-1.10.0/sample-data/sample_data.csv`
>
> Just extracting the column that has special characters, including the delimiter
> as one of the special characters, gives me an empty result set.
>
> Am I missing something?
>
> Appreciate the help.
>
> Thanks,
> Divya
>
