Hi,

I have already set the plugin configuration to extractHeader: true, and I followed the link below:
https://drill.apache.org/docs/lesson-2-run-queries-with-ansi-sql/

SELECT REGEXP_REPLACE(CAST(`Column1` AS VARCHAR(100)), '[,".]', '') AS `Col1`
FROM dfs.`installedsoftwares/ApacheDrill/apache-drill-1.10.0.tar/apache-drill-1.10.0/sample-data/sample_data.csv`

Extracting just the column that contains special characters (the delimiter being one of them) gives me an empty result set.

Am I missing something?

Appreciate the help.

Thanks,
Divya

On 27 July 2017 at 12:23, Paul Rogers <[email protected]> wrote:

> Hi Divya,
>
> I presume that “sample_data.csv” is your file? The default CSV
> configuration reads files without headers and puts all columns into a
> single array called “columns”. Do a SELECT * and you’ll see it. You’ll see
> an array that contains your data:
>
> [“Fred”, “Flintstone”]
>
> So, the correct query would be:
>
> SELECT REGEXP_REPLACE(CAST(columns[0] AS VARCHAR(100)), '[,".]', '') FROM
> dfs.`installedsoftwares/ApacheDrill/apache-drill-1.10.0.tar/apache-drill-1.10.0/sample-data/sample_data.csv`
>
> I notice the regex is messing with brackets. Are you trying to convert the
> array form shown above to a cleaner list? Won’t work: those brackets are
> not in the data; they are the textual sugar added to show the array when
> printing.
>
> Maybe what you want is:
>
> SELECT columns[0] as `a`, columns[1] as `b` …
>
> Or, if your file actually contains headers, use a table function (or
> storage plugin config) to specify to use the headings to create individual
> columns. See the example at [1] under “Using the Formats Attributes as
> Table Function Parameters”.
>
> - Paul
>
> [1] https://drill.apache.org/docs/plugin-configuration-basics/
>
>
> On Jul 26, 2017, at 8:22 PM, Divya Gehlot <[email protected]> wrote:
> >
> > Another thing I observed: when I run the query below
> > SELECT REGEXP_REPLACE('"This, col7 data yes."', '[,".]', '') FROM
> > (VALUES(1))
> > EXPR$0
> > This col7 data yes
> >
> >
> > When I run the same expression against the CSV file, it gives me an empty result set:
> > SELECT REGEXP_REPLACE(CAST(`Column1` AS VARCHAR(100)), '[,".]', '') FROM
> > dfs.`installedsoftwares/ApacheDrill/apache-drill-1.10.0.tar/apache-drill-1.10.0/sample-data/sample_data.csv`
> >
> > EXPR$0
> >
> >
> > P.S. As advised, I ran all the above queries on Drill 1.11.
> >
> > Appreciate the help.
> >
> > Thanks,
> > Divya
> >
> > On 27 July 2017 at 09:54, Divya Gehlot <[email protected]> wrote:
> >
> >> Hi,
> >> Please find attached the sample_data.csv file.
> >> Pasting the content of the CSV file below, in case the attachment doesn't
> >> reach:
> >>
> >>> Column1,Column2,Column3,Column4,Column5
> >>> colonedata1,coltwodata1,-35.924476,138.5987123,
> >>> colonedata2,coltwodata2,-27.4372536,153.0304583,137
> >>> colonedata3,coltwodata3,-35.2793885,149.1233503,134
> >>> colonedata4,coltwodata4,-33.8724176,151.2067579,
> >>> colonedata5,coltwodata5,,,
> >>> "This, col6 data",coltwodata6,-33.869732,151.2055553,351
> >>> "This, col7 data yes.",coltwodata7,1.2845045,103.8482739,80
> >>> Chifley,coltwodata5,,,
> >>
> >>
> >> Error:
> >>
> >>> Query Failed: An Error Occurred
> >>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> >>> IllegalArgumentException: reallocation size must be non-negative Fragment
> >>> 0:0
> >>
> >>
> >>
> >> Thanks all for the help.
> >>
> >> Thanks,
> >> Divya
> >>
> >>
> >> On 26 July 2017 at 23:31, Paul Rogers <[email protected]> wrote:
> >>
> >>> Hi Divya,
> >>>
> >>> We found a couple of issues in CSV files that would lead to the kind of
> >>> errors you encountered.
> >>> These issues will be fixed in the upcoming Drill
> >>> 1.11 release.
> >>>
> >>> Sharing a sample CSV file will let us check the issue. Even better,
> >>> voting is open for the 1.11 release. Please go ahead and download it and
> >>> try your file with that release. Let us know if you still have a problem.
> >>>
> >>> Thanks,
> >>>
> >>> - Paul
> >>>
> >>>> On Jul 26, 2017, at 6:14 AM, Khurram Faraaz <[email protected]> wrote:
> >>>>
> >>>> Can you please share your CSV file, the SQL query and the version of
> >>>> Drill that you are on. So someone can take a look and try to reproduce the
> >>>> error that you are seeing.
> >>>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Khurram
> >>>>
> >>>> ________________________________
> >>>> From: Divya Gehlot <[email protected]>
> >>>> Sent: Wednesday, July 26, 2017 3:18:08 PM
> >>>> To: [email protected]
> >>>> Subject: regex replace in string
> >>>>
> >>>> Hi,
> >>>> I have a CSV file where column values are
> >>>> "This is the column,one "
> >>>> "This is column , two"
> >>>> column3
> >>>> column4
> >>>>
> >>>> When I try to regex_replace it throws error
> >>>>
> >>>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> >>>>> IllegalArgumentException: reallocation size must be non-negative Fragment
> >>>>> 0:0
> >>>>
> >>>>
> >>>> How can I read the above columns as one string like This is the column
> >>>> one
> >>>> Appreciate the help
> >>>>
> >>>> Thanks,
> >>>> Divya
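
For reference, here is a minimal sketch of the table-function route that Paul points to in [1]. It assumes sample_data.csv really does begin with the header row shown in the pasted sample, and it reuses the path from the thread. The attributes `type`, `fieldDelimiter` and `extractHeader` are the text-format options described on the plugin configuration page. It has not been run against Drill 1.11 here, so treat it as a starting point rather than a confirmed fix:

```sql
-- Sketch only: ask Drill to read the first line as column names, so that
-- `Column1` resolves to a real column instead of the default `columns` array.
SELECT REGEXP_REPLACE(CAST(`Column1` AS VARCHAR(100)), '[,".]', '') AS `Col1`
FROM table(
  dfs.`installedsoftwares/ApacheDrill/apache-drill-1.10.0.tar/apache-drill-1.10.0/sample-data/sample_data.csv`(
    type => 'text',
    fieldDelimiter => ',',
    extractHeader => true
  )
);
```

A quick way to check whether the header is being picked up is to run SELECT * against the same table function: if the output still shows a single `columns` array rather than Column1 through Column5, the extractHeader setting is not taking effect for this file.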
