Hello Scott,

The usual approach is to pick the separator that gives you the most granular split (in your case, the comma) and then use SQL constructs to further transform the returned columns into the structure you would like to have. In SQL you can always create additional fields. I understand that this is probably the one thing you would like to avoid.

Alternatively, you could create your own plugin that takes a set of separators as a parameter to the storage plugin, splits the incoming records based on that set, and otherwise reuses the existing classes for the storage plugin. I personally believe this should be quite easy to accomplish by extending the existing code (even though I have not looked at the code for a considerable time now).
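To make the two options concrete, here is a minimal sketch of the splitting step in Python. It is not Drill code: the function names `split_record` and `split_two_stage` and the `separators` parameter are made up for illustration, and nothing here reflects how Drill's text reader is actually structured. The first function splits on a whole set of separators in one pass (roughly what a custom reader would do); the second splits on the comma first and then splits each field again on the semicolon (roughly what you would do after the fact in SQL).

```python
import re


def split_record(record, separators=",;"):
    """Split one record on any single character in `separators` (one pass)."""
    # Build a character class such as [,;]; re.escape guards characters
    # like '-' or ']' that would otherwise be special inside the class.
    pattern = "[" + re.escape(separators) + "]"
    return re.split(pattern, record)


def split_two_stage(record, primary=",", secondary=";"):
    """Split on the primary separator, then split each field on the secondary."""
    fields = []
    for coarse_field in record.split(primary):
        fields.extend(coarse_field.split(secondary))
    return fields


# Example record taken from the question further down in this thread.
record = "10-20-16,4477,99;98,aab,99;66,aab"
columns = split_record(record)
for index, value in enumerate(columns):
    print(f"columns[{index}] = {value}")
```

Both functions return the same eight fields for the example record. The single-pass version is the part a reader extension would need; everything else (schema handling, file access) would presumably come from the existing classes.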
I added very primitive XML support using the existing JSON classes and extending the reader as needed.

Regards,
Magnus

> On 19 Feb 2016, at 04:18, Jim Scott <[email protected]> wrote:
>
> The delimited file reader does not support that.
>
> On Thu, Feb 18, 2016 at 5:58 PM, Wilburn, Scott <[email protected]> wrote:
>
>> Jim,
>> Just to clarify, I'm trying to use Drill on files that contain records
>> where the fields are delimited by multiple different characters.
>>
>> Example record: 10-20-16,4477,99;98,aab,99;66,aab
>>
>> Desired result:
>> columns[0] = 10-20-16
>> columns[1] = 4477
>> columns[2] = 99
>> columns[3] = 98
>> columns[4] = aab
>> columns[5] = 99
>> columns[6] = 66
>> columns[7] = aab
>>
>> In this example, the record contains 8 fields when split by comma and by
>> semicolon.
>> Is something like this possible?
>>
>> Thanks,
>> Scott Wilburn
>>
>> -----Original Message-----
>> From: Jim Scott [mailto:[email protected]]
>> Sent: Thursday, February 18, 2016 03:46 PM
>> To: user
>> Subject: [E] Re: Multiple Delimiter Format
>>
>> Scott,
>>
>> You would need a format defined for each file type. e.g. csv has commas,
>> tsv has tabs, so on
>>
>> If you are looking for multiple delimiters within the same file or
>> potentially with a single file extension that isn't supported.
>>
>> Jim
>>
>> On Thu, Feb 18, 2016 at 5:43 PM, Wilburn, Scott <[email protected]> wrote:
>>
>>> Hello,
>>> Is there a way to specify multiple delimiters when configuring a
>>> storage plugin record format? For example, could I split records into
>>> fields by comma or by semicolon characters.
>>>
>>> Thanks,
>>> Scott Wilburn
>>
>> --
>> *Jim Scott*
>> Director, Enterprise Strategy & Architecture
>> +1 (347) 746-9281
>> @kingmesal <https://twitter.com/kingmesal>
>>
>> <http://www.mapr.com/>
>> [image: MapR Technologies] <http://www.mapr.com>
>>
>> Now Available - Free Hadoop On-Demand Training
>> <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
