Rafael,
Clark is using the filesystem plugin to query a Hadoop cluster. It seems weird that you can enumerate the files in a directory, but when you try to query one of those files, it breaks...
-- C
> On Jul 23, 2020, at 11:35 AM, Rafael Jaimes III <[email protected]> wrote:
>
> Hi all,
>
> It looks like the file is 644 already which should be good.
> I'm confused why the schema is called hdfs. dfs is a pre-built schema for
> HDFS and querying against flat files such as .json as you're trying to do.
> The default config for dfs also has a lot more content than what you
> pasted. Can you use the default and try again?
>
> Hope this helps,
> Rafael
>
>
> On Thu, Jul 23, 2020 at 11:30 AM Charles Givre <[email protected]> wrote:
>
>> Hi Clark,
>> That's strange. My initial thought is that this could be a permission
>> issue. However, it might also be that Drill isn't finding the file for
>> some reason.
>>
>> Could you try:
>>
>> SELECT *
>> FROM hdfs.`<full hdfs path to file>`
>>
>> Best,
>> --- C
>>
>>
>>> On Jul 23, 2020, at 11:23 AM, Updike, Clark <[email protected]> wrote:
>>>
>>> This is in 1.17. I can use SHOW FILES to list the file I'm targeting, but I cannot query it:
>>>
>>> apache drill> show files in hdfs.root.`/tmp/employee.json`;
>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
>>> | name          | isDirectory | isFile | length | owner    | group      | permissions | accessTime              | modificationTime        |
>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
>>> | employee.json | false       | true   | 474630 | me       | supergroup | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
>>> 1 row selected (3.039 seconds)
>>>
>>>
>>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
>>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
>>>
>>> [Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)
>>>
>>> The storage plugin uses the standard json config:
>>>
>>> "json": {
>>>   "type": "json",
>>>   "extensions": [
>>>     "json"
>>>   ]
>>> },
>>>
>>> I can't see any problems on the HDFS side. Full stack trace is below.
>>>
>>> Any ideas what could be causing this behavior?
>>>
>>> Thanks, Clark
>>>
>>>
>>>
>>> FULL STACKTRACE:
>>>
>>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
>>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
>>>
>>>
>>> [Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]
>>>
>>> (org.apache.calcite.runtime.CalciteContextException) From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
>>>     sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
>>>     sun.reflect.NativeConstructorAccessorImpl.newInstance():62
>>>     sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
>>>     java.lang.reflect.Constructor.newInstance():423
>>>     org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
>>>     org.apache.calcite.sql.SqlUtil.newContextException():824
>>>     org.apache.calcite.sql.SqlUtil.newContextException():809
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
>>>     org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
>>>     org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
>>>     org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
>>>     org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
>>>     org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
>>>     org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
>>>     org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>>>     org.apache.calcite.sql.SqlSelect.validate():216
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
>>>     org.apache.drill.exec.planner.sql.SqlConverter.validate():218
>>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
>>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
>>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
>>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
>>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
>>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
>>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
>>>     org.apache.drill.exec.work.foreman.Foreman.runSQL():590
>>>     org.apache.drill.exec.work.foreman.Foreman.run():275
>>>     java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>>>     java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>>>     java.lang.Thread.run():745
>>> Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
>>>     sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
>>>     sun.reflect.NativeConstructorAccessorImpl.newInstance():62
>>>     sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
>>>     java.lang.reflect.Constructor.newInstance():423
>>>     org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
>>>     org.apache.calcite.runtime.Resources$ExInst.ex():572
>>>     org.apache.calcite.sql.SqlUtil.newContextException():824
>>>     org.apache.calcite.sql.SqlUtil.newContextException():809
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
>>>     org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
>>>     org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
>>>     org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
>>>     org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
>>>     org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
>>>     org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
>>>     org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>>>     org.apache.calcite.sql.SqlSelect.validate():216
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
>>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
>>>     org.apache.drill.exec.planner.sql.SqlConverter.validate():218
>>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
>>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
>>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
>>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
>>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
>>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
>>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
>>>     org.apache.drill.exec.work.foreman.Foreman.runSQL():590
>>>     org.apache.drill.exec.work.foreman.Foreman.run():275
>>>     java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>>>     java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>>>     java.lang.Thread.run():745 (state=,code=0)
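
For reference, here is a rough sketch of what a fuller file-type storage plugin configuration pointing at HDFS might look like, along the lines of Rafael's suggestion to start from the default dfs config. This is not Clark's actual configuration, which the thread only shows in part. The NameNode host and port are placeholders, and the workspace entries are modeled on Drill's default dfs plugin, so treat them as assumptions to adapt.

```json
{
  "type": "file",
  "connection": "hdfs://<namenode-host>:<port>/",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    }
  }
}
```

If a plugin with this shape were saved under the name hdfs, the query from the thread would address the file as hdfs.root.`/tmp/employee.json`. In this layout the "json" entry Clark pasted sits under "formats", next to the "connection" and "workspaces" settings.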
