I have a CSV file that causes an exception when read by Drill. The file is slightly malformed (but R can read it).
Interestingly, if I don't parse the header line, I don't get the exception and the problematic embedded quotes are handled well. Likewise, deleting the first data line (which is well-formed) causes the exception to go away. Deleting the second data line also stops the exception. Fixing the quoting of the embedded quotes also fixes the problem. Swapping the two data lines has the same effect as deleting the first line. Repeating the first line after the second line still triggers the exception.

The file is this:

-------------------------
desc,name
"foo","x"
"manure called "foo"","y"
-------------

The exception is shown below. My thought is that if the CSV file is considered malformed, we should get an error on that line that says something along the lines of "malformed input". Even better would be to allow such lines to be omitted (up to some sanity limit) or to parse them correctly (which happens without headers being parsed).

Anybody have any thoughts?

Here is the R behavior (it omits the embedded quotes):

> f = read.csv("v.csv")
> f
               desc name
1               foo    x
2 manure called foo    y

And here is the exception:

org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NegativeArraySizeException

Please, refer to logs for more information.
[Error Id: 7153f837-45eb-43d1-8e19-e3ca0197c61b ]

  (java.lang.NegativeArraySizeException) null
    org.apache.drill.exec.vector.VarCharVector$Accessor.get():487
    org.apache.drill.exec.vector.VarCharVector$Accessor.getObject():514
    org.apache.drill.exec.vector.VarCharVector$Accessor.getObject():475
    org.apache.drill.exec.server.rest.WebUserConnection.sendData():147
    org.apache.drill.exec.ops.AccountingUserConnection.sendData():42
    org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():120
    org.apache.drill.exec.physical.impl.BaseRootExec.next():94
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():296
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():283
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1669
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():283
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
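
For illustration only, here is a minimal Python sketch of the behavior suggested above: report a "malformed input" error with the line number, and skip bad lines up to a sanity limit. This is not Drill code and says nothing about how Drill's text reader works internally. The names `read_csv_tolerant`, `max_bad_lines`, and `quote_row` are hypothetical, and the sketch assumes one record per physical line (no newlines inside quoted fields). It relies on the standard `csv` module's `strict=True` option to reject the unescaped embedded quotes.

```python
import csv
import io

# The three lines from the post, embedded so the sketch is self-contained.
SAMPLE = 'desc,name\n"foo","x"\n"manure called "foo"","y"\n'


def read_csv_tolerant(text, max_bad_lines=10):
    """Hypothetical helper: parse line by line, skipping malformed lines.

    Returns (rows, bad), where bad is a list of (line_number, message).
    Assumes one record per physical line.
    """
    rows, bad = [], []
    for line_no, line in enumerate(io.StringIO(text), start=1):
        try:
            # strict=True makes the csv module raise csv.Error on bad quoting
            # instead of silently guessing what the field should be.
            rows.append(next(csv.reader([line], strict=True)))
        except csv.Error as err:
            bad.append((line_no, str(err)))
            if len(bad) > max_bad_lines:
                raise ValueError(
                    f"malformed input: more than {max_bad_lines} bad lines"
                ) from err
    return rows, bad


def quote_row(fields):
    """Hypothetical helper: write one row with embedded quotes doubled."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")


if __name__ == "__main__":
    rows, bad = read_csv_tolerant(SAMPLE)
    print(rows)
    for line_no, message in bad:
        print(f"malformed input on line {line_no}: {message}")
    # The well-formed version of the problematic line.
    print(quote_row(['manure called "foo"', "y"]))
```

With the sample file, the header and the first data line are returned, line 3 is reported as malformed, and the last call prints the doubled-quote form `"manure called ""foo""","y"`, which is the kind of quoting fix that makes the exception go away.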