[jira] [Created] (DRILL-7713) Upgrade misc libraries whose outdated versions have reported vulnerabilities
Arina Ielchiieva created DRILL-7713: --- Summary: Upgrade misc libraries whose outdated versions have reported vulnerabilities Key: DRILL-7713 URL: https://issues.apache.org/jira/browse/DRILL-7713 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 List of libraries to update: |commons-beanutils-1.9.2.jar| |jackson-databind-2.9.9.jar| |xalan-2.7.1.jar| |commons-compress-1.18.jar| |metadata-extractor-2.11.0.jar| |xercesImpl-2.11.0.jar| |retrofit-2.1.0.jar| |snakeyaml-1.23.jar| |commons-codec-1.10.jar| -- This message was sent by Atlassian Jira (v8.3.4#803005)
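Upgrades like the ones listed above are usually pinned centrally in the root pom.xml via dependencyManagement. A minimal sketch for two of the listed libraries (the target version numbers here are illustrative assumptions, not the versions chosen for this issue — see the actual PR for those):

```xml
<!-- Sketch: pinning transitive dependencies to patched versions.
     Versions shown are illustrative, not the ones picked for DRILL-7713. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-compress</artifactId>
      <version>1.19</version> <!-- any version above the vulnerable 1.18 -->
    </dependency>
    <dependency>
      <groupId>commons-beanutils</groupId>
      <artifactId>commons-beanutils</artifactId>
      <version>1.9.4</version> <!-- illustrative patched version -->
    </dependency>
  </dependencies>
</dependencyManagement>
```

With the versions pinned in dependencyManagement, every module that pulls these artifacts transitively resolves to the patched version.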
[jira] [Created] (DRILL-7712) Fix issues after ZK upgrade
Arina Ielchiieva created DRILL-7712: --- Summary: Fix issues after ZK upgrade Key: DRILL-7712 URL: https://issues.apache.org/jira/browse/DRILL-7712 Project: Apache Drill Issue Type: Bug Affects Versions: 1.18.0 Reporter: Arina Ielchiieva Assignee: Vova Vysotskyi Fix For: 1.18.0 Warnings during jdbc-all build (absent when building with Mapr profile): {noformat} netty-transport-native-epoll-4.1.45.Final.jar, netty-transport-native-epoll-4.0.48.Final-linux-x86_64.jar define 46 overlapping classes: - io.netty.channel.epoll.AbstractEpollStreamChannel$2 - io.netty.channel.epoll.AbstractEpollServerChannel$EpollServerSocketUnsafe - io.netty.channel.epoll.EpollDatagramChannel - io.netty.channel.epoll.AbstractEpollStreamChannel$SpliceInChannelTask - io.netty.channel.epoll.NativeDatagramPacketArray - io.netty.channel.epoll.EpollSocketChannelConfig - io.netty.channel.epoll.EpollTcpInfo - io.netty.channel.epoll.EpollEventArray - io.netty.channel.epoll.EpollEventLoop - io.netty.channel.epoll.EpollSocketChannel - 36 more... netty-transport-native-unix-common-4.1.45.Final.jar, netty-transport-native-epoll-4.0.48.Final-linux-x86_64.jar define 15 overlapping classes: - io.netty.channel.unix.Errors$NativeConnectException - io.netty.channel.unix.ServerDomainSocketChannel - io.netty.channel.unix.DomainSocketAddress - io.netty.channel.unix.Socket - io.netty.channel.unix.NativeInetAddress - io.netty.channel.unix.DomainSocketChannelConfig - io.netty.channel.unix.Errors$NativeIoException - io.netty.channel.unix.DomainSocketReadMode - io.netty.channel.unix.ErrorsStaticallyReferencedJniMethods - io.netty.channel.unix.UnixChannel - 5 more... maven-shade-plugin has detected that some class files are present in two or more JARs. When this happens, only one single version of the class is copied to the uber jar. Usually this is not harmful and you can skip these warnings, otherwise try to manually exclude artifacts based on mvn dependency:tree -Ddetail=true and the above output. 
See http://maven.apache.org/plugins/maven-shade-plugin/ {noformat} Additional warning build with Mapr profile: {noformat} The following patterns were never triggered in this artifact inclusion filter: o 'org.apache.zookeeper:zookeeper-jute' {noformat} NPEs in tests (though tests do not fail): {noformat} [INFO] Running org.apache.drill.exec.coord.zk.TestZookeeperClient 4880 java.lang.NullPointerException 4881 at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269) 4882 at org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251) 4883 at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583) 4884 at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546) 4885 at org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java: {noformat} {noformat} [INFO] Running org.apache.drill.exec.coord.zk.TestEphemeralStore 5278 java.lang.NullPointerException 5279 at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269) 5280 at org.apache.zookeepe {noformat} {noformat} [INFO] Running org.apache.drill.yarn.zk.TestAmRegistration 6767 java.lang.NullPointerException 6768 at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269) 6769 at org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251) 6770 at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583) 6771 at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546) 6772 at org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java:929) 6773 at org.apache.curator.t {noformat} {noformat} org.apache.drill.yarn.client.TestCommandLineOptions 6823 java.lang.NullPointerException 6824 at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269) 6825 at 
org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251) 6826 at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583) 6827 at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546) 6828 at org.apac {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7710) Fix TestMetastoreCommands#testDefaultSegment test
Arina Ielchiieva created DRILL-7710: --- Summary: Fix TestMetastoreCommands#testDefaultSegment test Key: DRILL-7710 URL: https://issues.apache.org/jira/browse/DRILL-7710 Project: Apache Drill Issue Type: Bug Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Vova Vysotskyi Fix For: 1.18.0 Test {{TestMetastoreCommands#testDefaultSegment}} sometimes fails: {noformat} [ERROR] TestMetastoreCommands.testDefaultSegment:1870 expected: but was: {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [NOTICE] Maven 3.6.3
Looks like the decision was made to stay on Maven 3.6.3. For those who want to follow the discussion, please see the Jira and PR: https://issues.apache.org/jira/browse/DRILL-7708 https://github.com/apache/drill/pull/2061 Kind regards, Arina > On Apr 17, 2020, at 8:59 PM, Paul Rogers wrote: > > Hi Arina, > > Thanks for keeping us up to date! > > As it turns out, I use Ubuntu (Linux Mint) for development. Maven is > installed as a package using apt-get. Packages can lag behind a bit. The > latest Maven available via apt-get is 3.6.0. > > It is a nuisance to install a new version outside the package manager. I > changed the Maven version in the root pom.xml to 3.6.0 and the build seemed > to work. Any reason we need the absolute latest version rather than just > 3.6.0 or later? > > The workaround for now is to manually edit the pom.xml file on each checkout, > then revert the change before commit. Can we maybe adjust the "official" > version instead? > > > Thanks, > - Paul > > > > On Friday, April 17, 2020, 5:09:49 AM PDT, Arina Ielchiieva > wrote: > > Hi all, > > Starting from Drill 1.18.0 (and current master from commit 20ad3c9 [1]), > the Drill build will require Maven 3.6.3, otherwise the build will fail. > Please make sure you have Maven 3.6.3 installed on your environments. > > [1] > https://github.com/apache/drill/commit/20ad3c9837e9ada149c246fc7a4ac1fe02de6fe8 > > Kind regards, > Arina
Re: [DISCUSS]: Masking Creds in Query Plans
Agreed that we should not display sensitive data like passwords. I would say the best option is to mask it during output. Kind regards, Arina > On Apr 17, 2020, at 5:34 PM, Charles Givre wrote: > > Hello all, > I was thinking about this: if a user were to execute an EXPLAIN PLAN FOR > query, they get a lot of information about the storage plugin, including in > some cases creds. > The example below shows a query plan for the JDBC storage plugin. As you > can see, the user creds are right there. > > I'm wondering whether it would be advisable or possible to mask the creds in query > plans so that users can't access this information? If masking it isn't an > option, is there some other way to prevent users from seeing this > information? In a multi-tenant environment, it seems like a rather large > security hole. > Thanks, > -- C > > > { > "head" : { >"version" : 1, >"generator" : { > "type" : "ExplainHandler", > "info" : "" >}, >"type" : "APACHE_DRILL_PHYSICAL", >"options" : [ ], >"queue" : 0, >"hasResourcePlan" : false, >"resultMode" : "EXEC" > }, > "graph" : [ { >"pop" : "jdbc-scan", >"@id" : 5, >"sql" : "SELECT *\nFROM `stats`.`batting`", >"columns" : [ "`playerID`", "`yearID`", "`stint`", "`teamID`", "`lgID`", > "`G`", "`AB`", "`R`", "`H`", "`2B`", "`3B`", "`HR`", "`RBI`", "`SB`", "`CS`", > "`BB`", "`SO`", "`IBB`", "`HBP`", "`SH`", "`SF`", "`GIDP`" ], >"config" : { > "type" : "jdbc", > "driver" : "com.mysql.cj.jdbc.Driver", > "url" : "jdbc:mysql://localhost:3306/?serverTimezone=EST5EDT", > "username" : "", > "password" : "", > "caseInsensitiveTableNames" : false, > "sourceParameters" : { }, > "enabled" : true >}, >"userName" : "", >"cost" : { > "memoryCost" : 1.6777216E7, > "outputRowCount" : 100.0 >} > }, { >"pop" : "limit", >"@id" : 4, >"child" : 5, >"first" : 0, >"last" : 10, >"initialAllocation" : 100, >"maxAllocation" : 100, >"cost" : { > "memoryCost" : 1.6777216E7, > "outputRowCount" : 10.0 >} > }, { >"pop" : "limit", >"@id" : 3, > >
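The masking idea discussed above could be sketched roughly as follows. This is a minimal illustration, not Drill's actual implementation: the class name, the list of sensitive keys, and the regex approach are all hypothetical; a real fix would more likely mask fields during Jackson serialization of the plugin config.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlanMasker {
  // Hypothetical set of sensitive config keys to hide in EXPLAIN output.
  private static final Pattern SENSITIVE =
      Pattern.compile("(\"(?:password|username)\"\\s*:\\s*\")[^\"]*(\")");

  // Replaces the value of each sensitive key with asterisks,
  // leaving the rest of the rendered plan JSON untouched.
  public static String mask(String planJson) {
    Matcher m = SENSITIVE.matcher(planJson);
    return m.replaceAll("$1*****$2");
  }

  public static void main(String[] args) {
    String plan = "{ \"username\" : \"bob\", \"password\" : \"s3cret\" }";
    System.out.println(mask(plan));
  }
}
```

The regex variant only works on the rendered string; masking at serialization time would also cover formats other than JSON.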
[jira] [Created] (DRILL-7707) Unable to analyze table metadata if it resides in a non-writable workspace
Arina Ielchiieva created DRILL-7707: --- Summary: Unable to analyze table metadata if it resides in a non-writable workspace Key: DRILL-7707 URL: https://issues.apache.org/jira/browse/DRILL-7707 Project: Apache Drill Issue Type: Bug Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Unable to analyze table metadata if it resides in a non-writable workspace: {noformat} apache drill> analyze table cp.`employee.json` refresh metadata; Error: VALIDATION ERROR: Unable to create or drop objects. Schema [cp] is immutable. {noformat} Stacktrace: {noformat} [Error Id: b7f233cd-f090-491e-a487-5fc4c25444a4 ] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657) at org.apache.drill.exec.planner.sql.SchemaUtilites.resolveToDrillSchemaInternal(SchemaUtilites.java:230) at org.apache.drill.exec.planner.sql.SchemaUtilites.resolveToDrillSchema(SchemaUtilites.java:208) at org.apache.drill.exec.planner.sql.handlers.DrillTableInfo.getTableInfoHolder(DrillTableInfo.java:101) at org.apache.drill.exec.planner.sql.handlers.MetastoreAnalyzeTableHandler.getPlan(MetastoreAnalyzeTableHandler.java:108) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:283) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:163) at org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:128) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:93) at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:593) at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:274) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7706) Drill RDBMS Metastore
Arina Ielchiieva created DRILL-7706: --- Summary: Drill RDBMS Metastore Key: DRILL-7706 URL: https://issues.apache.org/jira/browse/DRILL-7706 Project: Apache Drill Issue Type: New Feature Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 Currently Drill has only one Metastore implementation, based on Iceberg tables. Iceberg tables are file-based storage that supports concurrent writes / reads but is required to be placed on a distributed file system. This Jira aims to implement a Drill RDBMS Metastore which will store Drill Metastore metadata in a database of the user's choice. Currently, the PostgreSQL and MySQL databases are supported; others might work as well, but no testing was done. Also, out of the box, for demonstration / testing purposes, Drill will set up a SQLite file-based embedded database, but this is only applicable to Drill running in embedded mode. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[NOTICE] Maven 3.6.3
Hi all, Starting from Drill 1.18.0 (and current master from commit 20ad3c9 [1]), the Drill build will require Maven 3.6.3, otherwise the build will fail. Please make sure you have Maven 3.6.3 installed on your environments. [1] https://github.com/apache/drill/commit/20ad3c9837e9ada149c246fc7a4ac1fe02de6fe8 Kind regards, Arina
[jira] [Created] (DRILL-7704) Update Maven dependency
Arina Ielchiieva created DRILL-7704: --- Summary: Update Maven dependency Key: DRILL-7704 URL: https://issues.apache.org/jira/browse/DRILL-7704 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 Currently, the minimal Maven version in Drill is 3.3.3; it is old and contains a dependency on the plexus-utils-3.0.20 library, which has reported vulnerabilities. This Jira aims to update the Maven version to 3.6.3. Having the latest Maven version is also crucial when using Maven plugins that depend on it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-7528) Update Avro format plugin documentation
[ https://issues.apache.org/jira/browse/DRILL-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-7528. - Resolution: Fixed Added in https://github.com/apache/drill/commit/0e17eea19aca27b88c98778fcfb7057a45501ab9. > Update Avro format plugin documentation > --- > > Key: DRILL-7528 > URL: https://issues.apache.org/jira/browse/DRILL-7528 > Project: Apache Drill > Issue Type: Task > Reporter: Arina Ielchiieva >Assignee: Vova Vysotskyi >Priority: Major > Fix For: 1.18.0 > > > Currently the documentation states that the Avro plugin is experimental. > As of Drill 1.17 / 1.18 its code is pretty stable (since Drill 1.18 it uses > EVF). > The documentation should be updated accordingly. > https://drill.apache.org/docs/querying-avro-files/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7672) Make metadata type required when reading from / writing into Drill Metastore
Arina Ielchiieva created DRILL-7672: --- Summary: Make metadata type required when reading from / writing into Drill Metastore Key: DRILL-7672 URL: https://issues.apache.org/jira/browse/DRILL-7672 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 The Metastore consists of components: TABLES, VIEWS, etc. (so far only TABLES is implemented). Each component's metadata can have types. For example, TABLES metadata can be of the following types: TABLE, SEGMENT, FILE, ROW_GROUP, PARTITION. In the initial Metastore implementation, the metadata type was indicated in filter expressions when reading from / writing into the Metastore. For the Iceberg Metastore, where all data is stored in files, this was not that critical: when information about a table is retrieved, the table folder is queried. For other Metastore implementations, knowing the metadata type can be more important. For example, an RDBMS Metastore would store TABLES metadata in different tables, so knowing which table to query would improve performance compared to querying all tables. Of course, we could traverse the query filter and look for hints about which metadata type is needed, but it is much better to know the required metadata type beforehand without any extra logic. Taking into account that Metastore metadata is queried only in Drill code, the developer knows beforehand what needs to be fetched / updated / deleted. This Jira aims to make the metadata type required when reading from / writing into the Drill Metastore. This change does not have any effect on users; it is just internal code refactoring. -- This message was sent by Atlassian Jira (v8.3.4#803005)
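The motivation above can be illustrated with a small sketch (hypothetical names, not the actual Metastore API): when the metadata type is a required argument, an RDBMS implementation can map it directly to one backing table instead of scanning all of them.

```java
import java.util.EnumMap;
import java.util.Map;

public class MetadataTypeDispatch {
  // Metadata types of the TABLES component, as listed in the issue.
  enum MetadataType { TABLE, SEGMENT, FILE, ROW_GROUP, PARTITION }

  // Hypothetical mapping from metadata type to a backing RDBMS table name.
  private static final Map<MetadataType, String> BACKING_TABLES =
      new EnumMap<>(MetadataType.class);
  static {
    BACKING_TABLES.put(MetadataType.TABLE, "tables");
    BACKING_TABLES.put(MetadataType.SEGMENT, "segments");
    BACKING_TABLES.put(MetadataType.FILE, "files");
    BACKING_TABLES.put(MetadataType.ROW_GROUP, "row_groups");
    BACKING_TABLES.put(MetadataType.PARTITION, "partitions");
  }

  // With the type known up front, only one table needs to be queried;
  // without it, the store would have to probe every backing table.
  public static String tableFor(MetadataType type) {
    return BACKING_TABLES.get(type);
  }
}
```

This is why requiring the type beforehand beats inferring it from filter expressions: the dispatch becomes a constant-time lookup instead of filter analysis.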
[jira] [Created] (DRILL-7665) Add UNION to schema parser
Arina Ielchiieva created DRILL-7665: --- Summary: Add UNION to schema parser Key: DRILL-7665 URL: https://issues.apache.org/jira/browse/DRILL-7665 Project: Apache Drill Issue Type: Improvement Reporter: Arina Ielchiieva Fix For: 1.18.0 After DRILL-7633 defined a proper type string for UNION, it should be added to the schema parser to allow proper ser / de. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-7624) When Hive plugin is enabled with default config, cannot execute any SQL query
[ https://issues.apache.org/jira/browse/DRILL-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-7624. - Resolution: Fixed Merged with commit id 63e64c2a8156a66ba1b1c8c1ec62e8da467bbbc9. > When Hive plugin is enabled with default config, cannot execute any SQL query > - > > Key: DRILL-7624 > URL: https://issues.apache.org/jira/browse/DRILL-7624 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.18.0 >Reporter: Dmytro Kondriukov >Assignee: Paul Rogers >Priority: Major > Fix For: 1.18.0 > > > *Preconditions:* > Enable "hive" plugin, without editing configuration (default config) > *Steps:* > Run any valid query: > {code:sql} > SELECT 100; > {code} > *Expected result:* The query should be successfully executed. > *Actual result:* "UserRemoteException : INTERNAL_ERROR ERROR: Failure > setting up Hive metastore client." > {noformat} > org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: > Failure setting up Hive metastore client. > Plugin name hive > Plugin class org.apache.drill.exec.store.hive.HiveStoragePlugin > Please, refer to logs for more information. > [Error Id: db44f5c3-5136-4fc6-8158-50b63d775fe0 ] > {noformat} > > {noformat} > (org.apache.drill.common.exceptions.ExecutionSetupException) Failure > setting up Hive metastore client. 
> org.apache.drill.exec.store.hive.schema.HiveSchemaFactory.():78 > org.apache.drill.exec.store.hive.HiveStoragePlugin.():77 > sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2 > sun.reflect.NativeConstructorAccessorImpl.newInstance():62 > sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45 > java.lang.reflect.Constructor.newInstance():423 > org.apache.drill.exec.store.ClassicConnectorLocator.create():274 > org.apache.drill.exec.store.ConnectorHandle.newInstance():98 > org.apache.drill.exec.store.PluginHandle.plugin():143 > > org.apache.drill.exec.store.StoragePluginRegistryImpl$PluginIterator.next():616 > > org.apache.drill.exec.store.StoragePluginRegistryImpl$PluginIterator.next():601 > org.apache.drill.exec.planner.sql.handlers.SqlHandlerConfig.getRules():48 > > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():367 > > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():351 > > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():338 > > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel():663 > > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():198 > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():169 > org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():283 > org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():163 > org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():140 > org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():93 > org.apache.drill.exec.work.foreman.Foreman.runSQL():590 > org.apache.drill.exec.work.foreman.Foreman.run():275 > java.util.concurrent.ThreadPoolExecutor.runWorker():1149 > java.util.concurrent.ThreadPoolExecutor$Worker.run():624 > java.lang.Thread.run():748 > Caused By (org.apache.hadoop.hive.metastore.api.MetaException) Unable to > open a test connection to the given database. 
JDBC url = > jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true, username = > APP. Terminating connection pool (set lazyInit to true if you expect to start > your database after your app). Original Exception: -- > java.sql.SQLException: Failed to create database > '../sample-data/drill_hive_db', see the next exception for details. > at > org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown > Source) > at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source) > at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedConnection40.(Unknown Source) > at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source) > at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source
[jira] [Created] (DRILL-7639) Replace DBCP2 with HikariCP in RDBMS (JDBC) plugin
Arina Ielchiieva created DRILL-7639: --- Summary: Replace DBCP2 with HikariCP in RDBMS (JDBC) plugin Key: DRILL-7639 URL: https://issues.apache.org/jira/browse/DRILL-7639 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Hikari is much faster and more reliable than DBCP2. See the comparison and benchmarks: https://beansroasted.wordpress.com/2017/07/29/connection-pool-analysis/ https://github.com/brettwooldridge/HikariCP -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7627) Update MySql version in JdbcStoragePlugin tests and cache ~/.embedmysql
Arina Ielchiieva created DRILL-7627: --- Summary: Update MySql version in JdbcStoragePlugin tests and cache ~/.embedmysql Key: DRILL-7627 URL: https://issues.apache.org/jira/browse/DRILL-7627 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 When running tests on a clean environment (or when ~/.embedmysql is cleaned), there are currently issues when downloading MySQL version v5_6_21: the download hangs and JdbcStoragePlugin tests fail with a timeout. {noformat} Download Version 5.6.21:Linux:B64 START Download Version 5.6.21:Linux:B64 DownloadSize: 311516309 Download Version 5.6.21:Linux:B64 0% Download Version 5.6.21:Linux:B64 5% Download Version 5.6.21:Linux:B64 10% Download Version 5.6.21:Linux:B64 15% Download Version 5.6.21:Linux:B64 20% Download Version 5.6.21:Linux:B64 25% Download Version 5.6.21:Linux:B64 30% Download Version 5.6.21:Linux:B64 35% Download Version 5.6.21:Linux:B64 40% Download Version 5.6.21:Linux:B64 45% Download Version 5.6.21:Linux:B64 50% Download Version 5.6.21:Linux:B64 55% Download Version 5.6.21:Linux:B64 60% TestJdbcPluginWithMySQLIT.initMysql:70 » Distribution java.net.SocketTimeoutEx. {noformat} The workaround is to manually download MySQL: {noformat} mkdir -p ~/.embedmysql/MySQL-5.6 wget -P ~/Downloads http://mirror.cogentco.com/pub/mysql/MySQL-5.6/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz cp ~/Downloads/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz ~/.embedmysql/MySQL-5.6/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz {noformat} Upgrading to the latest available MySQL version (5_7_19) fixes this issue. It would also be nice to cache the ~/.embedmysql folder during GitHub Actions CI runs to save the time spent on downloading and unpacking. -- This message was sent by Atlassian Jira (v8.3.4#803005)
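Caching the ~/.embedmysql folder in GitHub Actions could look like the step below. This is a sketch: the workflow file location and the cache key are assumptions, not the ones merged for this issue.

```yaml
# Sketch of a cache step for the Drill CI workflow (e.g. .github/workflows/ci.yml).
- name: Cache embedded MySQL distribution
  uses: actions/cache@v1
  with:
    path: ~/.embedmysql
    # Hypothetical key; any stable key works since the pinned MySQL version rarely changes.
    key: ${{ runner.os }}-embedmysql-5.7.19
```

On a cache hit the ~300 MB download and unpacking step is skipped entirely.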
[jira] [Created] (DRILL-7617) Disabled storage plugins configuration is not displayed on Web UI
Arina Ielchiieva created DRILL-7617: --- Summary: Disabled storage plugins configuration is not displayed on Web UI Key: DRILL-7617 URL: https://issues.apache.org/jira/browse/DRILL-7617 Project: Apache Drill Issue Type: Bug Affects Versions: 1.18.0 Reporter: Arina Ielchiieva Assignee: Paul Rogers Fix For: 1.18.0 After DRILL-7590, disabled storage plugins are displayed on the Web UI, but if you press the Update button their configuration is not shown; {{null}} is shown instead. If you check the plugin file, it does contain the configuration information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7614) Add helpful tips to Testing.md
Arina Ielchiieva created DRILL-7614: --- Summary: Add helpful tips to Testing.md Key: DRILL-7614 URL: https://issues.apache.org/jira/browse/DRILL-7614 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Fix For: 1.18.0 Add some helpful tips to Testing.md, see PR for details. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: I want to subscribe to you
Please follow the instructions below to subscribe to the Drill mailing lists: https://drill.apache.org/mailinglists/ Kind regards, Arina On Wed, Feb 5, 2020 at 4:19 AM luoc wrote: > I want to subscribe to you
[jira] [Resolved] (DRILL-7459) Fetch size does not work on Postgres JDBC plugin
[ https://issues.apache.org/jira/browse/DRILL-7459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-7459. - Resolution: Fixed Was fixed in the scope of DRILL-7467. The following plugin configuration should be used to support batches: {noformat} { "type": "jdbc", "driver": "org.postgresql.Driver", "url": "jdbc:postgresql://localhost:5959/my_db?defaultRowFetchSize=2", "username": "my_user", "password": "my_pass", "caseInsensitiveTableNames": false, "sourceParameters": { "defaultAutoCommit": false }, "enabled": true } {noformat} > Fetch size does not work on Postgres JDBC plugin > > > Key: DRILL-7459 > URL: https://issues.apache.org/jira/browse/DRILL-7459 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JDBC >Affects Versions: 1.15.0 >Reporter: Priyanka Bhoir >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.18.0 > > > To prevent Drill from going out of memory, it is suggested to set the > fetch size in the JDBC URL to enable data streaming ([#DRILL-6794 | > https://issues.apache.org/jira/browse/DRILL-6794] discusses this). This does > not work on Postgres for the following reason: > For fetchSize to work on Postgres, the connection must not be in > autocommit mode. There is no parameter to set autocommit to false in the > connection string other than programmatically calling > conn.setAutoCommit(false). > See > [https://jdbc.postgresql.org/documentation/93/query.html#fetchsize-example] > See [https://jdbc.postgresql.org/documentation/head/connect.html] for the > list of all connection string properties. > The fix is to add a property 'defaultAutoCommit' to JdbcStorageConfig and call > BasicDataSource#setDefaultAutoCommit in JdbcStoragePlugin. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[ANNOUNCE] New PMC member: Bohdan Kazydub
I am pleased to announce that the Drill PMC invited Bohdan Kazydub to the PMC and he has accepted the invitation. Congratulations Bohdan and welcome! - Arina (on behalf of the Drill PMC)
[jira] [Created] (DRILL-7549) Fix validation error when querying absent folder in embedded mode
Arina Ielchiieva created DRILL-7549: --- Summary: Fix validation error when querying absent folder in embedded mode Key: DRILL-7549 URL: https://issues.apache.org/jira/browse/DRILL-7549 Project: Apache Drill Issue Type: Bug Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 {noformat} apache drill> select * from dfs.tmp.`abc.parquet`; Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object 'abc.parquet' not found within 'dfs.tmp' [Error Id: 0dad391e-ea4d-4d13-95e7-218dec865ad2 ] (state=,code=0) apache drill> select * from dfs.tmp.`abc/abc`; Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object 'abc/abc' not found within 'dfs.tmp' [Error Id: 94ea83c7-4983-4958-8a82-eea49b58d4ed ] (state=,code=0) apache drill> use dfs.tmp; +--+-+ | ok | summary | +--+-+ | true | Default schema changed to [dfs.tmp] | +--+-+ 1 row selected (0.277 seconds) apache drill (dfs.tmp)> select * from `abc.parquet`; Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 27: Object 'abc.parquet' not found [Error Id: d814df53-2a94-46da-81e5-f49a3d50a665 ] (state=,code=0) apache drill (dfs.tmp)> select * from `abc/abc`; Error: VALIDATION ERROR: null [Error Id: 6c11d397-f893-4ef6-9832-4a96a2029f7d ] (state=,code=0) {noformat} Full error: {noformat} Caused by: java.lang.IllegalArgumentException: at org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkArgument(Preconditions.java:121) at org.apache.drill.exec.store.sys.store.LocalPersistentStore.makePath(LocalPersistentStore.java:147) at org.apache.drill.exec.store.sys.store.LocalPersistentStore.get(LocalPersistentStore.java:166) at org.apache.drill.exec.store.sys.CaseInsensitivePersistentStore.get(CaseInsensitivePersistentStore.java:42) at org.apache.drill.exec.store.StoragePluginRegistryImpl.getPlugin(StoragePluginRegistryImpl.java:167) at org.apache.calcite.jdbc.DynamicRootSchema.loadSchemaFactory(DynamicRootSchema.java:80) at 
org.apache.calcite.jdbc.DynamicRootSchema.getImplicitSubSchema(DynamicRootSchema.java:67) at org.apache.calcite.jdbc.CalciteSchema.getSubSchema(CalciteSchema.java:265) at org.apache.calcite.sql.validate.EmptyScope.resolve_(EmptyScope.java:133) at org.apache.calcite.sql.validate.EmptyScope.resolveTable(EmptyScope.java:99) at org.apache.calcite.sql.validate.DelegatingScope.resolveTable(DelegatingScope.java:203) at org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl(IdentifierNamespace.java:105) at org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:177) at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3129) at org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:63) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3111) at org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:63) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3383) at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969) at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:216) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:944) at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:651) at 
org.apache.drill.exec.planner.sql.conversion.SqlConverter.validate(SqlConverter.java:189) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:648) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:196) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlH
[jira] [Resolved] (DRILL-7466) Configurable number of connections in JDBC storage connection pool
[ https://issues.apache.org/jira/browse/DRILL-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-7466. - Resolution: Fixed Merged into master with commit id 50cd931da0364355535029db8e9d7a1445218803. > Configurable number of connections in JDBC storage connection pool > - > > Key: DRILL-7466 > URL: https://issues.apache.org/jira/browse/DRILL-7466 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JDBC >Affects Versions: 1.17.0 >Reporter: Priyanka Bhoir > Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.18.0 > > > Drill does not allow the number of connections in the pool to be configured. > When a JDBC storage plugin is created, it creates 8 connections per plugin, > which happens to be the default for the DBCP connection pool. Currently, there is > no way to configure these parameters using the storage configuration. Max Idle is > set to 8, leaving all connections open even when unused. This situation > creates unnecessary connections to the DB. > These parameters must be made configurable in the JDBC storage plugin > configuration. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-6738) validationQuery option for JDBC storage plugin
[ https://issues.apache.org/jira/browse/DRILL-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6738. - Resolution: Fixed Merged into master with commit id 50cd931da0364355535029db8e9d7a1445218803. > validationQuery option for JDBC storage plugin > -- > > Key: DRILL-6738 > URL: https://issues.apache.org/jira/browse/DRILL-6738 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JDBC >Affects Versions: 1.14.0 > Environment: Apache Drill version 1.14.0 running on CentOS 7.0.1406. > MySQL version 5.5.43 running on CentOS 6.4. > MySQL connector/j version 5.1.44. >Reporter: Cheolgoo Kang >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.18.0 > > > Currently the JDBC storage plugin uses DBCP2 for connection pooling, but it > does not validate whether a connection is still available, and simply > reports a connection-lost error. > We should be able to have a validationQuery option as described > [here|https://commons.apache.org/proper/commons-dbcp/configuration.html] in > Drill's JDBC storage plugin configuration.
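A sketch of wiring the option through, assuming (hypothetically) that the plugin config exposes a pass-through map named `sourceParameters`: `validationQuery` and `testOnBorrow` are real DBCP properties from the configuration page linked above, so with `testOnBorrow` enabled the query would run each time a connection is taken from the pool, dropping dead connections instead of surfacing a connection-lost error.

```json
{
  "type": "jdbc",
  "driver": "com.mysql.jdbc.Driver",
  "url": "jdbc:mysql://localhost:3306/db",
  "sourceParameters": {
    "validationQuery": "SELECT 1",
    "testOnBorrow": true
  }
}
```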
[jira] [Resolved] (DRILL-7398) dfs avro file support to treat non-existent fields as null
[ https://issues.apache.org/jira/browse/DRILL-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-7398. - Resolution: Fixed Merged into master with commit id bf7277c9c8725d6b9a56988f72c31ede1d486b85. > dfs avro file support to treat non-existent fields as null > -- > > Key: DRILL-7398 > URL: https://issues.apache.org/jira/browse/DRILL-7398 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Avro >Reporter: Dan Schmitt >Priority: Major > > It would be nice to have an option to return null for non-existent fields. > Currently we have a use case of evolving Avro file schemas where things get > added. > It would be nice to be able to query old and new data without having to > rewrite the old data with a new schema with null/optional/default values. > A short-term fix would be to provide an option to return null with dfs for > Avro fields that aren't found in a file. > A nicer fix would be to support a resolving schema that can be given to dfs > to read the Avro files and present a uniform interface (so aliases, defaults, > etc. could be utilized without rewriting each Avro file).
[jira] [Created] (DRILL-7544) Upgrade Iceberg version to support Parquet 1.11.0
Arina Ielchiieva created DRILL-7544: --- Summary: Upgrade Iceberg version to support Parquet 1.11.0 Key: DRILL-7544 URL: https://issues.apache.org/jira/browse/DRILL-7544 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 Upgrade the Iceberg version and load it using JitPack once Iceberg supports Parquet 1.11 (https://github.com/apache/incubator-iceberg/pull/708). Also remove the workaround in ExpirationHandler since https://github.com/apache/incubator-iceberg/issues/181 is resolved.
[jira] [Created] (DRILL-7542) Fix Drill-on-Yarn logger
Arina Ielchiieva created DRILL-7542: --- Summary: Fix Drill-on-Yarn logger Key: DRILL-7542 URL: https://issues.apache.org/jira/browse/DRILL-7542 Project: Apache Drill Issue Type: Bug Affects Versions: 1.17.0, 1.16.0 Reporter: Arina Ielchiieva The Drill project uses the Logback logger backed by SLF4J: {noformat} import org.slf4j.Logger; import org.slf4j.LoggerFactory; private static final Logger logger = LoggerFactory.getLogger(ResultsListener.class); {noformat} The Drill-on-Yarn project uses Commons Logging: {noformat} import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; private static final Log LOG = LogFactory.getLog(AbstractScheduler.class); {noformat} It would be nice if all project components used the same approach for logging.
[jira] [Created] (DRILL-7537) Convert Parquet format to EVF
Arina Ielchiieva created DRILL-7537: --- Summary: Convert Parquet format to EVF Key: DRILL-7537 URL: https://issues.apache.org/jira/browse/DRILL-7537 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva
[jira] [Created] (DRILL-7535) Convert LTSV to EVF
Arina Ielchiieva created DRILL-7535: --- Summary: Convert LTSV to EVF Key: DRILL-7535 URL: https://issues.apache.org/jira/browse/DRILL-7535 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva
[jira] [Created] (DRILL-7538) Convert MaprDB format to EVF
Arina Ielchiieva created DRILL-7538: --- Summary: Convert MaprDB format to EVF Key: DRILL-7538 URL: https://issues.apache.org/jira/browse/DRILL-7538 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva
[jira] [Created] (DRILL-7536) Convert Image format to EVF
Arina Ielchiieva created DRILL-7536: --- Summary: Convert Image format to EVF Key: DRILL-7536 URL: https://issues.apache.org/jira/browse/DRILL-7536 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva
[jira] [Created] (DRILL-7534) Convert Httpd to EVF
Arina Ielchiieva created DRILL-7534: --- Summary: Convert Httpd to EVF Key: DRILL-7534 URL: https://issues.apache.org/jira/browse/DRILL-7534 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva Assignee: Charles Givre
[jira] [Created] (DRILL-7533) Convert Pcapng to EVF
Arina Ielchiieva created DRILL-7533: --- Summary: Convert Pcapng to EVF Key: DRILL-7533 URL: https://issues.apache.org/jira/browse/DRILL-7533 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva Assignee: Charles Givre
[jira] [Created] (DRILL-7532) Convert SysLog to EVF
Arina Ielchiieva created DRILL-7532: --- Summary: Convert SysLog to EVF Key: DRILL-7532 URL: https://issues.apache.org/jira/browse/DRILL-7532 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva Assignee: Charles Givre Convert SysLog to EVF
[jira] [Created] (DRILL-7531) Move format plugins to EVF
Arina Ielchiieva created DRILL-7531: --- Summary: Move format plugins to EVF Key: DRILL-7531 URL: https://issues.apache.org/jira/browse/DRILL-7531 Project: Apache Drill Issue Type: Improvement Reporter: Arina Ielchiieva Fix For: 1.18.0 This is an umbrella Jira to track the process of moving format plugins to EVF.
[jira] [Created] (DRILL-7530) Fix class names in loggers
Arina Ielchiieva created DRILL-7530: --- Summary: Fix class names in loggers Key: DRILL-7530 URL: https://issues.apache.org/jira/browse/DRILL-7530 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 Some loggers have incorrect class names which leads to incorrect information in logs. This Jira aims to fix all occurrences of incorrect class names in loggers. Preliminary list (some occurrences will be excluded): {noformat} Name: MapRDBTableCache.java. Expected: MapRDBTableCache. Got: MapRDBFormatPlugin. Name: MapRDBTableCache.java. Expected: MapRDBTableCache. Got: MapRDBFormatPlugin. Name: HiveFuncHolderExpr.java. Expected: HiveFuncHolderExpr. Got: DrillFuncHolderExpr. Name: HiveFuncHolder.java. Expected: HiveFuncHolder. Got: FunctionImplementationRegistry. Name: HiveMetadataProvider.java. Expected: HiveMetadataProvider. Got: HiveStats. Name: TableEntryCacheLoader.java. Expected: TableEntryCacheLoader. Got: TableNameCacheLoader. Name: TestKafkaSuit.java. Expected: TestKafkaSuit. Got: LoggerFactory. Name: DrillTestWrapper.java. Expected: DrillTestWrapper. Got: BaseTestQuery. Name: TestDisabledFunctionality.java. Expected: TestDisabledFunctionality. Got: TestExampleQueries. Name: TestMergeJoin.java. Expected: TestMergeJoin. Got: HashAggBatch. Name: TestLateralJoinCorrectnessBatchProcessing.java. Expected: TestLateralJoinCorrectnessBatchProcessing. Got: TestNewLateralJoinCorrectness. Name: TestOperatorRecordBatch.java. Expected: TestOperatorRecordBatch. Got: SubOperatorTest. Name: TestPauseInjection.java. Expected: TestPauseInjection. Got: DummyClass. Name: TestComplexTypeWriter.java. Expected: TestComplexTypeWriter. Got: TestComplexTypeReader. Name: AvgIntervalTypeFunctions.java. Expected: AvgIntervalTypeFunctions. Got: AvgFunctions. Name: SSLConfigBuilder.java. Expected: SSLConfigBuilder. Got: org.apache.drill.exec.ssl.SSLConfigBuilder. Name: PlannerPhase.java. Expected: PlannerPhase. 
Got: DrillRuleSets. Name: AbstractIndexDescriptor.java. Expected: AbstractIndexDescriptor. Got: AbstractIndexDescriptor . Name: CoveringPlanNoFilterGenerator.java. Expected: CoveringPlanNoFilterGenerator. Got: CoveringIndexPlanGenerator. Name: AbstractSqlSetHandler.java. Expected: AbstractSqlSetHandler. Got: AbstractSqlHandler. Name: HashJoinMemoryCalculatorImpl.java. Expected: HashJoinMemoryCalculatorImpl. Got: BuildSidePartitioning. Name: HashJoinMemoryCalculatorImpl.java. Expected: HashJoinMemoryCalculatorImpl. Got: PostBuildCalculationsImpl. Name: HashJoinMemoryCalculator.java. Expected: HashJoinMemoryCalculator. Got: PartitionStatSet. Name: NestedLoopJoinTemplate.java. Expected: NestedLoopJoinTemplate. Got: NestedLoopJoinBatch. Name: PartitionLimitRecordBatch.java. Expected: PartitionLimitRecordBatch. Got: LimitRecordBatch. Name: HashAggTemplate.java. Expected: HashAggTemplate. Got: HashAggregator. Name: SpilledRecordbatch.java. Expected: SpilledRecordbatch. Got: SimpleRecordBatch. Name: StreamingAggTemplate.java. Expected: StreamingAggTemplate. Got: StreamingAggregator. Name: SortMemoryManager.java. Expected: SortMemoryManager. Got: ExternalSortBatch. Name: SortConfig.java. Expected: SortConfig. Got: ExternalSortBatch. Name: SortImpl.java. Expected: SortImpl. Got: ExternalSortBatch. Name: SingleSenderCreator.java. Expected: SingleSenderCreator. Got: SingleSenderRootExec. Name: HashTableTemplate.java. Expected: HashTableTemplate. Got: HashTable. Name: FrameSupportTemplate.java. Expected: FrameSupportTemplate. Got: NoFrameSupportTemplate. Name: ScreenCreator.java. Expected: ScreenCreator. Got: ScreenRoot. Name: UnionAll.java. Expected: UnionAll. Got: Filter. Name: AvgIntervalTypeFunctions.java. Expected: AvgIntervalTypeFunctions. Got: AvgFunctions. Name: PersistedOptionValue.java. Expected: PersistedOptionValue. Got: Deserializer. Name: ThreadsResources.java. Expected: ThreadsResources. Got: MetricsResources. Name: RepeatedVarCharOutput.java. 
Expected: RepeatedVarCharOutput. Got: BaseFieldOutput. Name: MockSubScanPOP.java. Expected: MockSubScanPOP. Got: MockGroupScanPOP. Name: InMemoryStore.java. Expected: InMemoryStore. Got: InMemoryPersistentStore. Name: ParquetColumnChunkPageWriteStore.java. Expected: ParquetColumnChunkPageWriteStore. Got: ParquetDirectByteBufferAllocator. Name: CorrelationTypeFunctions.java. Expected: CorrelationTypeFunctions. Got: ${aggrtype.className}Functions. Name: MathFunctionTemplates.java. Expected: MathFunctionTemplates. Got: ${inputType.className}Functions. Name: CastHigh.java. Expected: CastHigh. Got: CastHighFunctions. Name: IntervalAggrFunctions2.java. Expected: IntervalAggrFunctions2. Got: ${aggrtype.className}Functions. Name: SumZeroAggr.java. Expected: SumZeroAggr. Got: SumZeroFunctions. Name: NumericFunctionsTemplates.java
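One way to make mismatches like those above impossible is to derive the logger name from the enclosing class rather than a hand-typed class literal, so the declaration survives copy-paste between classes unchanged. A minimal stdlib-only sketch of the idiom; the SLF4J call it would normally feed is shown only as a comment, since that needs a third-party dependency:

```java
import java.lang.invoke.MethodHandles;

public class LoggerNameDemo {
    // MethodHandles.lookup() is evaluated in the context of the class that
    // contains it, so lookupClass() always returns the enclosing class.
    // With SLF4J this would read:
    //   private static final Logger logger =
    //       LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
    static final Class<?> ENCLOSING = MethodHandles.lookup().lookupClass();

    public static void main(String[] args) {
        // Prints the enclosing class name, never a pasted-in wrong one.
        System.out.println(ENCLOSING.getSimpleName());
    }
}
```

Adopting this idiom would also make the "Expected vs. Got" audit above unnecessary, since there is no literal left to get wrong.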
[jira] [Resolved] (DRILL-2873) CTAS reports error when timestamp values in CSV file are quoted
[ https://issues.apache.org/jira/browse/DRILL-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-2873. - Resolution: Fixed > CTAS reports error when timestamp values in CSV file are quoted > --- > > Key: DRILL-2873 > URL: https://issues.apache.org/jira/browse/DRILL-2873 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 0.9.0 > Environment: 64e3ec52b93e9331aa5179e040eca19afece8317 | DRILL-2611: > value vectors should report valid value count | 16.04.2015 @ 13:53:34 EDT >Reporter: Khurram Faraaz >Priority: Major > Fix For: 1.17.0 > > > When timestamp values are quoted in quotes (") inside a CSV data file, CTAS > statement reports error. > Failing CTAS > {code} > 0: jdbc:drill:> create table prqFrmCSV02 as select cast(columns[0] as int) > col_int, cast(columns[1] as bigint) col_bgint, cast(columns[2] as char(10)) > col_char, cast(columns[3] as varchar(18)) col_vchar, cast(columns[4] as > timestamp) col_tmstmp, cast(columns[5] as date) col_date, cast(columns[6] as > boolean) col_boln, cast(columns[7] as double) col_dbl from `csvToPrq.csv`; > Query failed: SYSTEM ERROR: Invalid format: ""2015-04-23 23:47:00.124"" > [a601a66a-b305-4a92-9836-f39edcdc8fe8 on centos-02.qa.lab:31010] > Error: exception while executing query: Failure while executing query. 
> (state=,code=0) > {code} > Stack trace from drillbit.log > {code} > 2015-04-24 18:41:09,721 [2ac571ba-778f-f3d5-c60f-af2e536905a3:frag:0:0] ERROR > o.a.drill.exec.ops.FragmentContext - Fragment Context received failure -- > Fragment: 0:0 > org.apache.drill.common.exceptions.DrillUserException: SYSTEM ERROR: Invalid > format: ""2015-04-23 23:47:00.124"" > [a601a66a-b305-4a92-9836-f39edcdc8fe8 on centos-02.qa.lab:31010] > at > org.apache.drill.common.exceptions.DrillUserException$Builder.build(DrillUserException.java:115) > ~[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.common.exceptions.ErrorHelper.wrap(ErrorHelper.java:39) > ~[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.ops.FragmentContext.fail(FragmentContext.java:151) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:131) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135) > 
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:74) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:76) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:64) > ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:164) > ~
[jira] [Created] (DRILL-7528) Update Avro format plugin documentation
Arina Ielchiieva created DRILL-7528: --- Summary: Update Avro format plugin documentation Key: DRILL-7528 URL: https://issues.apache.org/jira/browse/DRILL-7528 Project: Apache Drill Issue Type: Task Reporter: Arina Ielchiieva Currently the documentation states that the Avro plugin is experimental. As of Drill 1.17 / 1.18 its code is pretty stable (since Drill 1.18 it uses EVF). The documentation should be updated accordingly. https://drill.apache.org/docs/querying-avro-files/
[jira] [Resolved] (DRILL-5024) CTAS with LIMIT 0 query in SELECT stmt does not create parquet file
[ https://issues.apache.org/jira/browse/DRILL-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5024. - Resolution: Fixed > CTAS with LIMIT 0 query in SELECT stmt does not create parquet file > --- > > Key: DRILL-5024 > URL: https://issues.apache.org/jira/browse/DRILL-5024 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.8.0 >Reporter: Khurram Faraaz >Priority: Major > Fix For: 1.17.0 > > > Note that CTAS was successful > {noformat} > 0: jdbc:drill:schema=dfs.tmp> create table regtbl_w0rows as select * from > typeall_l LIMIT 0; > +---++ > | Fragment | Number of records written | > +---++ > | 0_0 | 0 | > +---++ > 1 row selected (0.51 seconds) > {noformat} > But a SELECT on CTAS created file fails. > {noformat} > 0: jdbc:drill:schema=dfs.tmp> select * from regtbl_w0rows; > Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 27: Table > 'regtbl_w0rows' not found > SQL Query null > [Error Id: 0569cf98-3800-43ee-b635-aa101b016d46 on centos-01.qa.lab:31010] > (state=,code=0) > {noformat} > DROP on the CTAS created table also fails > {noformat} > 0: jdbc:drill:schema=dfs.tmp> drop table regtbl_w0rows; > Error: VALIDATION ERROR: Table [regtbl_w0rows] not found > [Error Id: fb0b1ea8-f76d-42e2-b69c-4beae2798bdf on centos-01.qa.lab:31010] > (state=,code=0) > 0: jdbc:drill:schema=dfs.tmp> > {noformat} > Verified that CTAS did not create a physical file in dfs.tmp schema > {noformat} > [test@cent01 bin]# hadoop fs -ls /tmp/regtbl_w0rows > ls: `/tmp/regtbl_w0rows': No such file or directory > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-3916) Assembly for JDBC storage plugin missing
[ https://issues.apache.org/jira/browse/DRILL-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-3916. - Resolution: Fixed > Assembly for JDBC storage plugin missing > > > Key: DRILL-3916 > URL: https://issues.apache.org/jira/browse/DRILL-3916 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JDBC, Storage - Other >Affects Versions: 1.2.0 >Reporter: Andrew >Assignee: Andrew >Priority: Major > > The JDBC storage plugin is missing from the assembly instructions, which > means that the plugin fails to be loaded by the drillbit on start.
[jira] [Created] (DRILL-7526) Assertion Error when type is used together with schema in table function
Arina Ielchiieva created DRILL-7526: --- Summary: Assertion Error when only type with used with schema in table function Key: DRILL-7526 URL: https://issues.apache.org/jira/browse/DRILL-7526 Project: Apache Drill Issue Type: Bug Reporter: Arina Ielchiieva {{org.apache.drill.TestSchemaWithTableFunction}} {noformat} @Test public void testWithTypeAndSchema() { String query = "select Year from table(dfs.`store/text/data/cars.csvh`(type=> 'text', " + "schema=>'inline=(`Year` int)')) where Make = 'Ford'"; queryBuilder().sql(query).print(); } {noformat} {noformat} Caused by: java.lang.AssertionError: BOOLEAN at org.apache.calcite.sql.type.SqlTypeExplicitPrecedenceList.compareTypePrecedence(SqlTypeExplicitPrecedenceList.java:140) at org.apache.calcite.sql.SqlUtil.bestMatch(SqlUtil.java:687) at org.apache.calcite.sql.SqlUtil.filterRoutinesByTypePrecedence(SqlUtil.java:656) at org.apache.calcite.sql.SqlUtil.lookupSubjectRoutines(SqlUtil.java:515) at org.apache.calcite.sql.SqlUtil.lookupRoutine(SqlUtil.java:435) at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:240) at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:218) at org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5640) at org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5627) at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139) at org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1692) at org.apache.calcite.sql.validate.ProcedureNamespace.validateImpl(ProcedureNamespace.java:53) at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3129) at 
org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:63) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3111) at org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:63) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3383) at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969) at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:216) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:944) at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:651) at org.apache.drill.exec.planner.sql.conversion.SqlConverter.validate(SqlConverter.java:189) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:648) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:196) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:170) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:283) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:163) at org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:128) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:93) at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:590) at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:275) ... 1 more {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7525) Convert SequenceFiles to EVF
Arina Ielchiieva created DRILL-7525: --- Summary: Convert SequenceFiles to EVF Key: DRILL-7525 URL: https://issues.apache.org/jira/browse/DRILL-7525 Project: Apache Drill Issue Type: Improvement Reporter: Arina Ielchiieva Convert SequenceFiles to EVF
[jira] [Created] (DRILL-7504) Upgrade Parquet library to 1.11.0
Arina Ielchiieva created DRILL-7504: --- Summary: Upgrade Parquet library to 1.11.0 Key: DRILL-7504 URL: https://issues.apache.org/jira/browse/DRILL-7504 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Fix For: 1.18.0 Upgrade Parquet library to 1.11.0
[ANNOUNCE] New Committer: Denys Ordynskiy
The Project Management Committee (PMC) for Apache Drill has invited Denys Ordynskiy to become a committer, and we are pleased to announce that he has accepted. Denys has been contributing to Drill for more than a year. He has made many contributions as a QA engineer: he found, tested, and verified important bugs and features. Recently he actively participated in the Hadoop 3 migration verification and tested current and previous releases. He also contributed to drill-test-framework to automate Drill tests. Welcome Denys, and thank you for your contributions! - Arina (on behalf of Drill PMC)
[jira] [Created] (DRILL-7497) Fix warnings when starting Drill on Windows using Java 11
Arina Ielchiieva created DRILL-7497: --- Summary: Fix warnings when starting Drill on Windows using Java 11 Key: DRILL-7497 URL: https://issues.apache.org/jira/browse/DRILL-7497 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Vova Vysotskyi Fix For: 1.18.0 Warnings are displayed in SqlLine when starting Drill in embedded mode on Windows using Java 11: {noformat} WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by javassist.util.proxy.SecurityActions (file:/C:/drill_1_17/apache-drill-1.17.0/apache-drill-1.17.0/jars/3rdparty/javassist-3.26.0-GA.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) WARNING: Please consider reporting this to the maintainers of javassist.util.proxy.SecurityActions WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release {noformat}
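Until the offending libraries are fixed, Java 9+ provides a standard way to silence this particular warning: explicitly open the package being reflected into. A hedged workaround sketch for drill-env.sh; the `DRILL_JAVA_OPTS` variable name is an assumption for illustration, so adjust it to whatever JVM-options hook the startup scripts actually provide.

```shell
# Hypothetical workaround sketch (not the ticket's final fix):
# open java.base/java.lang, the package javassist.util.proxy.SecurityActions
# reflects into, so the JVM stops emitting the illegal-access warning.
# The variable name DRILL_JAVA_OPTS is assumed for illustration.
export DRILL_JAVA_OPTS="$DRILL_JAVA_OPTS --add-opens=java.base/java.lang=ALL-UNNAMED"
```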
[jira] [Created] (DRILL-7492) Do not display Failure node entry if it is not set
Arina Ielchiieva created DRILL-7492: --- Summary: Do not display Failure node entry if it is not set Key: DRILL-7492 URL: https://issues.apache.org/jira/browse/DRILL-7492 Project: Apache Drill Issue Type: Task Affects Versions: 1.16.0 Reporter: Arina Ielchiieva https://github.com/apache/drill/blob/master/exec/java-exec/src/main/resources/rest/profile/profile.ftl#L223 {{errorNode}} is not set most of the time; we should not display the {{Failure node}} entry if its value is null.
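A minimal sketch of the kind of guard the template could use: the `errorNode` variable name comes from the linked profile.ftl, but the surrounding markup here is illustrative only, and `??` / `?has_content` are standard FreeMarker built-ins for missing-value and emptiness checks.

```ftl
<#-- Render the Failure node row only when errorNode is present and non-empty -->
<#if errorNode?? && errorNode?has_content>
  <tr><td>Failure node</td><td>${errorNode}</td></tr>
</#if>
```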
[jira] [Created] (DRILL-7489) NPE for star query with alias
Arina Ielchiieva created DRILL-7489: --- Summary: NPE for star query with alias Key: DRILL-7489 URL: https://issues.apache.org/jira/browse/DRILL-7489 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Igor Guzenko Query where alias is used for star should throw exception. Query on table with defined schema returns correct exception: {noformat} select * col_alias from sys.version; org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: At line 1, column 8: Unknown identifier '*' {noformat} Query on table with dynamic schema returns NPE: {noformat} select * col_alias from cp.`tpch/nation.parquet` Caused by: java.lang.NullPointerException: at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3690) at org.apache.calcite.sql2rel.SqlToRelConverter.access$2200(SqlToRelConverter.java:217) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4765) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4061) at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:317) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4625) at org.apache.calcite.sql2rel.StandardConvertletTable.lambda$new$9(StandardConvertletTable.java:204) at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:63) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4756) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4061) at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4625) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3908) at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:670) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:627) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3150) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:563) at org.apache.drill.exec.planner.sql.SqlConverter.toRel(SqlConverter.java:381) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:685) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:202) at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:172) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:282) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:162) at org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:139) at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:92) at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:590) at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:275) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7488) Complex queries on INFORMATION_SCHEMA fail when `planner.slice_target` is small
Arina Ielchiieva created DRILL-7488: --- Summary: Complex queries on INFORMATION_SCHEMA fail when `planner.slice_target` is small Key: DRILL-7488 URL: https://issues.apache.org/jira/browse/DRILL-7488 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Set a small value for slice_target: {code:sql} alter session set `planner.slice_target`=1; {code} Run a query on INFORMATION_SCHEMA: {code:sql} select * from information_schema.`tables` where TABLE_NAME='lineitem' order by TABLE_NAME; {code} It will fail with the following exception: {noformat} java.lang.Exception: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException: Schema tree can only be created in root fragment. This is a non-root fragment. {noformat}
[jira] [Created] (DRILL-7482) Fix "could not find artifact" warning during Drill build
Arina Ielchiieva created DRILL-7482: --- Summary: Fix "could not find artifact" warning during Drill build Key: DRILL-7482 URL: https://issues.apache.org/jira/browse/DRILL-7482 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Vova Vysotskyi Fix For: 1.17.0 Fix the following warning during the Drill build: Could not find artifact org.glassfish:javax.el:pom:3.0.1-b07-SNAPSHOT in sonatype-nexus-snapshots (https://oss.sonatype.org/content/repositories/snapshots)
[jira] [Created] (DRILL-7481) Fix raw type warnings in Iceberg Metastore
Arina Ielchiieva created DRILL-7481: --- Summary: Fix raw type warnings in Iceberg Metastore Key: DRILL-7481 URL: https://issues.apache.org/jira/browse/DRILL-7481 Project: Apache Drill Issue Type: Bug Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 Fix raw type warnings in Iceberg Metastore module and related classes
[jira] [Created] (DRILL-7476) Info in some sys schema tables is missing if queried with a limit clause
Arina Ielchiieva created DRILL-7476: --- Summary: Info in some sys schema tables is missing if queried with a limit clause Key: DRILL-7476 URL: https://issues.apache.org/jira/browse/DRILL-7476 Project: Apache Drill Issue Type: Bug Affects Versions: 1.17.0 Reporter: Arina Ielchiieva

Affected schema: sys
Affected tables: connections, threads, memory

If a query is executed with a limit clause, information for some fields is missing:

*Connections*
{noformat}
apache drill (sys)> select * from connections;
+-----------+---------------+----------+-------------------------+-------------------+---------+-----------------+-------------+----------+---------+
| user      | client        | drillbit | established             | duration          | queries | isAuthenticated | isEncrypted | usingSSL | session |
+-----------+---------------+----------+-------------------------+-------------------+---------+-----------------+-------------+----------+---------+
| anonymous | xxx.xxx.x.xxx | xxx      | 2019-12-10 13:45:01.766 | 59 min 42.393 sec | 27      | false           | false       | false    | xxx     |
+-----------+---------------+----------+-------------------------+-------------------+---------+-----------------+-------------+----------+---------+
1 row selected (0.1 seconds)

apache drill (sys)> select * from connections limit 1;
+-----------+---------------+----------+-------------------------+-------------------+---------+-----------------+-------------+----------+---------+
| user      | client        | drillbit | established             | duration          | queries | isAuthenticated | isEncrypted | usingSSL | session |
+-----------+---------------+----------+-------------------------+-------------------+---------+-----------------+-------------+----------+---------+
|           |               |          | 2019-12-10 13:45:01.766 |                   | 28      | false           | false       | false    |         |
+-----------+---------------+----------+-------------------------+-------------------+---------+-----------------+-------------+----------+---------+
{noformat}

*Threads*
{noformat}
apache drill (sys)> select * from threads;
+----------+-----------+---------------+--------------+
| hostname | user_port | total_threads | busy_threads |
+----------+-----------+---------------+--------------+
| xxx      | 31010     | 27            | 23           |
+----------+-----------+---------------+--------------+
1 row selected (0.119 seconds)

apache drill (sys)> select * from threads limit 1;
+----------+-----------+---------------+--------------+
| hostname | user_port | total_threads | busy_threads |
+----------+-----------+---------------+--------------+
|          | 31010     | 27            | 24           |
+----------+-----------+---------------+--------------+
{noformat}

*Memory*
{noformat}
apache drill (sys)> select * from memory;
+----------+-----------+--------------+------------+----------------+--------------------+------------+
| hostname | user_port | heap_current | heap_max   | direct_current | jvm_direct_current | direct_max |
+----------+-----------+--------------+------------+----------------+--------------------+------------+
| xxx      | 31010     | 493974480    | 4116185088 | 5048576        | 122765             | 8589934592 |
+----------+-----------+--------------+------------+----------------+--------------------+------------+
1 row selected (0.115 seconds)

apache drill (sys)> select * from memory limit 1;
+----------+-----------+--------------+------------+----------------+--------------------+------------+
| hostname | user_port | heap_current | heap_max   | direct_current | jvm_direct_current | direct_max |
+----------+-----------+--------------+------------+----------------+--------------------+------------+
|          | 31010     | 499343272    | 4116185088 | 9048576        | 122765             | 8589934592 |
+----------+-----------+--------------+------------+----------------+--------------------+------------+
{noformat}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7454) Convert the Avro format plugin to use EVF
Arina Ielchiieva created DRILL-7454: --- Summary: Convert the Avro format plugin to use EVF Key: DRILL-7454 URL: https://issues.apache.org/jira/browse/DRILL-7454 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.17.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.18.0 Convert the Avro format plugin to use EVF. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-6723) Kafka reader fails on malformed JSON
[ https://issues.apache.org/jira/browse/DRILL-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6723. - Resolution: Fixed Merged with commit id ffd3c352586a5884747edcc5a93b0c625a47e100 in the scope of DRILL-7388. > Kafka reader fails on malformed JSON > > > Key: DRILL-6723 > URL: https://issues.apache.org/jira/browse/DRILL-6723 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JSON >Affects Versions: 1.14.0 > Environment: java version "1.8.0_91" > Java(TM) SE Runtime Environment (build 1.8.0_91-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode) > $ bin/drill-embedded > Aug 30, 2018 5:29:08 PM org.glassfish.jersey.server.ApplicationHandler > initialize > INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 > 01:25:26... > apache drill 1.14.0 > >Reporter: Matt Keranen >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.17.0 > > > With a Kafka topic where the first message is a simple string "test", and > store.json.reader.skip_invalid_records = true, a SELECT * FROM the topic still > fails with the following error. > In addition, using an OFFSET does not appear to allow the bad messages to be > bypassed. The same error occurs on the first message. > {noformat} > 0: jdbc:drill:zk=local> select * from kafka.`logs` limit 5; > Error: DATA_READ ERROR: Failure while reading messages from kafka. > Recordreader was at record: 1 > Not a JSON Object: "TEST" > Fragment 0:0 > [Error Id: 965d7a69-3d77-4a11-9613-3892a95c4a63 on x.x.x.x:31010] > (state=,code=0) > {noformat} > Description: > A new option {{store.kafka.reader.skip_invalid_records}} will be introduced to > cover this case. Default is false. -- This message was sent by Atlassian Jira (v8.3.4#803005)
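A minimal sketch of the skip-invalid-records pattern the proposed option enables. This is not Drill's actual reader code: `looksLikeJsonObject` is a deliberately crude stand-in for a real JSON parser, used only to show how a message like {{"TEST"}} would be skipped rather than failing the scan:

```java
import java.util.ArrayList;
import java.util.List;

public class SkipInvalidRecords {
    // Hypothetical validity check standing in for a real JSON parser:
    // a Kafka message like "TEST" is not a JSON object and fails the check.
    static boolean looksLikeJsonObject(String msg) {
        String t = msg.trim();
        return t.startsWith("{") && t.endsWith("}");
    }

    // When skipInvalid is true (option enabled), malformed messages are
    // silently dropped; when false (the proposed default), the first bad
    // message aborts the read, mirroring the "Not a JSON Object" error.
    static List<String> readTopic(List<String> messages, boolean skipInvalid) {
        List<String> rows = new ArrayList<>();
        for (String msg : messages) {
            if (!looksLikeJsonObject(msg)) {
                if (skipInvalid) {
                    continue; // option enabled: ignore the bad message
                }
                throw new IllegalStateException("Not a JSON Object: " + msg);
            }
            rows.add(msg);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> topic = List.of("\"TEST\"", "{\"level\":\"info\"}");
        // Only the well-formed JSON object survives.
        System.out.println(readTopic(topic, true).size());
    }
}
```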
[jira] [Resolved] (DRILL-6739) Update Kafka libs to 2.0.0+ version
[ https://issues.apache.org/jira/browse/DRILL-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6739. - Resolution: Fixed Merged with commit id ffd3c352586a5884747edcc5a93b0c625a47e100 in the scope of DRILL-7388. > Update Kafka libs to 2.0.0+ version > --- > > Key: DRILL-6739 > URL: https://issues.apache.org/jira/browse/DRILL-6739 > Project: Apache Drill > Issue Type: Task > Components: Storage - Kafka >Affects Versions: 1.14.0 >Reporter: Vitalii Diravka > Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.17.0 > > > The current version of Kafka libs is 0.11.0.1 > The last version is 2.0.0 (September 2018) > https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients > Looks like the only changes which should be done are: > * replacing {{serverConfig()}} method with {{staticServerConfig()}} in Drill > {{EmbeddedKafkaCluster}} class > * Replacing deprecated {{AdminUtils}} with {{kafka.zk.AdminZkClient}} > [https://github.com/apache/kafka/blob/3cdc78e6bb1f83973a14ce1550fe3874f7348b05/core/src/main/scala/kafka/admin/AdminUtils.scala#L35] > https://issues.apache.org/jira/browse/KAFKA-6545 > The initial work: https://github.com/vdiravka/drill/commits/DRILL-6739 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-7290) “Failed to construct kafka consumer” using Apache Drill
[ https://issues.apache.org/jira/browse/DRILL-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-7290. - Resolution: Fixed Merged with commit id ffd3c352586a5884747edcc5a93b0c625a47e100 in the scope of DRILL-7388. > “Failed to construct kafka consumer” using Apache Drill > --- > > Key: DRILL-7290 > URL: https://issues.apache.org/jira/browse/DRILL-7290 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.14.0, 1.16.0 >Reporter: Aravind Voruganti > Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.17.0 > > > I am using the Apache Drill (1.14) JDBC driver in my application, which > consumes data from Kafka. The application works just fine for some > time, and after a few iterations it fails to execute due to the following *Too > many files open* issue. I made sure there are no file handle leaks in my code > but am still not sure why this issue is happening. > > It looks like the issue is happening from within the Apache Drill libraries > when constructing the Kafka consumer. Can anyone please help me get this > problem fixed? > The problem goes away when I restart my Apache drillbit, but very soon it > happens again. I did check the file descriptor count on my unix machine using > *{{ulimit -a | wc -l}} & {{lsof -a -p | wc -l}}* before and after the > drill process restart, and it seems the drill process is taking a considerable > number of file descriptors. I tried increasing the file descriptor count on the > system but still no luck. > I have followed the Apache Drill storage plugin documentation in configuring > the Kafka plugin into Apache Drill at > [https://drill.apache.org/docs/kafka-storage-plugin/] > Any help on this issue is highly appreciated. Thanks. 
> JDBC URL: *{{jdbc:drill:drillbit=localhost:31010;schema=kafka}}* > NOTE: I am pushing down the filters in my query {{SELECT * FROM myKafkaTopic > WHERE kafkaMsgTimestamp > 1560210931626}} > > 2019-06-11 08:43:13,639 [230033ed-d410-ae7c-90cb-ac01d3b404cc:foreman] INFO > o.a.d.e.store.kafka.KafkaGroupScan - User Error Occurred: Failed to fetch > start/end offsets of the topic myKafkaTopic (Failed to construct kafka > consumer) > org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: Failed to > fetch start/end offsets of the topic myKafkaTopic > Failed to construct kafka consumer > [Error Id: 73f896a7-09d4-425b-8cd5-f269c3a6e69a ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.store.kafka.KafkaGroupScan.init(KafkaGroupScan.java:198) > [drill-storage-kafka-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.store.kafka.KafkaGroupScan.(KafkaGroupScan.java:98) > [drill-storage-kafka-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.store.kafka.KafkaStoragePlugin.getPhysicalScan(KafkaStoragePlugin.java:83) > [drill-storage-kafka-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:111) > [drill-java-exec-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:99) > [drill-java-exec-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.planner.logical.DrillScanRel.(DrillScanRel.java:89) > [drill-java-exec-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.planner.logical.DrillScanRel.(DrillScanRel.java:69) > [drill-java-exec-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.planner.logical.DrillScanRel.(DrillScanRel.java:62) > [drill-java-exec-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.planner.logical.DrillScanRule.onMatch(DrillScanRule.java:38) > [drill-java-exec-1.14.0.jar:1.14.0] > at > 
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:212) > [calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:652) > [calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:368) > [calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:429) > [drill-java-exec-1.14.0.jar:1.14.0] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:3
[jira] [Created] (DRILL-7448) Fix warnings when running Drill memory tests
Arina Ielchiieva created DRILL-7448: --- Summary: Fix warnings when running Drill memory tests Key: DRILL-7448 URL: https://issues.apache.org/jira/browse/DRILL-7448 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Bohdan Kazydub Fix For: 1.17.0 {noformat} -- drill-memory-base [INFO] --- [INFO] T E S T S [INFO] --- [INFO] Running org.apache.drill.exec.memory.TestEndianess [INFO] Running org.apache.drill.exec.memory.TestAccountant 16:21:45,719 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy] 16:21:45,719 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback-test.xml] at [jar:file:/Users/arina/Development/git_repo/drill/common/target/drill-common-1.17.0-SNAPSHOT-tests.jar!/logback-test.xml] 16:21:45,733 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@dbd940d - URL [jar:file:/Users/arina/Development/git_repo/drill/common/target/drill-common-1.17.0-SNAPSHOT-tests.jar!/logback-test.xml] is not of type file 16:21:45,780 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set 16:21:45,802 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - Could not find Janino library on the class path. Skipping conditional processing. 
16:21:45,802 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See also http://logback.qos.ch/codes.html#ifJanino 16:21:45,803 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender] 16:21:45,811 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT] 16:21:45,826 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property 16:21:45,866 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - ROOT level set to ERROR 16:21:45,866 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - Could not find Janino library on the class path. Skipping conditional processing. 16:21:45,866 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See also http://logback.qos.ch/codes.html#ifJanino 16:21:45,866 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - The object on the top the of the stack is not the root logger 16:21:45,866 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - It is: ch.qos.logback.core.joran.conditional.IfAction 16:21:45,866 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration. 
16:21:45,867 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@71d15f18 - Registering current configuration as safe fallback point 16:21:45,717 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy] 16:21:45,717 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback-test.xml] at [jar:file:/Users/arina/Development/git_repo/drill/common/target/drill-common-1.17.0-SNAPSHOT-tests.jar!/logback-test.xml] 16:21:45,729 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@2698dc7 - URL [jar:file:/Users/arina/Development/git_repo/drill/common/target/drill-common-1.17.0-SNAPSHOT-tests.jar!/logback-test.xml] is not of type file 16:21:45,778 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set 16:21:45,807 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - Could not find Janino library on the class path. Skipping conditional processing. 16:21:45,807 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See also http://logback.qos.ch/codes.html#ifJanino 16:21:45,808 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender] 16:21:45,814 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT] 16:21:45,829 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property 16:21:45,868 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - ROOT level set to ERROR 16:21:45,868 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - Could not find Janino library on the class path. Skipping conditional processing. 
16:21:45,868 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See also http://logback.qos.ch/codes.html#ifJanino 16:21:45,868 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - The object on the top the of the stack is not the root logger 16:21:45,868 |-WARN
Re: [DISCUSS] 1.17.0 release
Hi Charles. The Hadoop update is under development and testing; if Anton / Denys are able to finish this task before the release, we will include it. Though preliminary, we are targeting this feature for the 1.18 release. Kind regards, Arina On Tue, Nov 5, 2019 at 4:56 PM Charles Givre wrote: > Volodymyr, > What I was getting at was that this PR has a bunch of tasks which are > blocking it. Some seemed relatively simple, such as removing comments from > the pom.xml file. Could we quickly include any of these tasks so that we > can get this resolved quickly for v 1.18? > --C > > > On Nov 5, 2019, at 9:40 AM, Volodymyr Vysotskyi > wrote: > > > > I agree with you that this Jira is an important one, but I don't think > that > > PR for DRILL-6540 will be included in this release, > > since the PR itself is not completed and there are a lot of things which > > should be checked before merging this PR. > > > > I think it is better to have a stable version with known limitations than > > an unchecked new one. > > But we definitely should include it in the next 1.18.0 release. > > > > Kind regards, > > Volodymyr Vysotskyi > > > > > > On Tue, Nov 5, 2019 at 3:46 PM Charles Givre wrote: > > > >> One other question... > >> Are there any parts of DRILL-6540 that we could include in version 1.17? > >> IMHO, this is an important one. > >> > >> Regards, > >> -- C > >> > >>> On Nov 4, 2019, at 2:08 PM, Vova Vysotskyi wrote: > >>> > >>> Hi Charles, > >>> > >>> Thanks for pointing to this Jira. I'm not sure that we can update most > of > >>> the libraries listed there considering current project dependencies. > >>> I'll add a comment to this Jira ticket with my thoughts. > >>> > >>> Kind regards, > >>> Volodymyr Vysotskyi > >>> > >>> > >>> On Mon, Nov 4, 2019 at 8:59 PM Charles Givre wrote: > >>> > Hi Volodymyr, > I'd like to see if we can get some or all of DRILL-7416 in as well. > >> This > is a security update which is important IMHO. 
> Thanks, > -- C > > > On Nov 4, 2019, at 1:57 PM, Volodymyr Vysotskyi < > volody...@apache.org> > wrote: > > > > Hello Drillers, > > > > About 6 months have passed since the previous release, and it's time to > > discuss and start planning for 1.17.0. > > I volunteer to manage the new release. > > > > We have 6 Jira tickets with "reviewable" status, 2 "in progress" and > 6 open > > tickets [1]. Jira tickets marked as ready to commit will be merged > soon and > > I hope other PRs from this list will be completed before the cut-off > date. > > > > Among these tickets, I want to include DRILL-7273 [2] in the release > (pull > > request is already opened, but CR comments should be addressed). > > > > I would like to propose a preliminary release cut-off date as the > >> middle of > > the next week (Nov 13) or the beginning of the week after that (Nov > 18). > > > > Please let me know if there are any other Jira tickets you are working on > which > > should be included in this release. > > > > [1] > > > > >> > https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185=DRILL=1425 > > [2] https://issues.apache.org/jira/browse/DRILL-7273 > > > > Kind regards, > > Volodymyr Vysotskyi > > > >> > >> > >
[jira] [Resolved] (DRILL-5506) Apache Drill Querying data from compressed .zip file
[ https://issues.apache.org/jira/browse/DRILL-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5506. - Resolution: Fixed Fixed in the scope of DRILL-5674. > Apache Drill Querying data from compressed .zip file > > > Key: DRILL-5506 > URL: https://issues.apache.org/jira/browse/DRILL-5506 > Project: Apache Drill > Issue Type: Task > Components: Functions - Drill >Affects Versions: 1.10.0 >Reporter: john li >Priority: Major > Labels: newbie > Fix For: 1.17.0 > > > Referring to the previous issue > https://issues.apache.org/jira/browse/DRILL-2806 > According to the remarks from Steven Phillips added a comment - 16/Apr/15 > 21:50 > "The only compression codecs that work with Drill out of the box are gz, and > bz2. Additional codecs can be added by including the relevant libraries in > the Drill classpath." > I would like to learn how to use Apache Drill to query data from compressed > .zip file. > > However , the only default compression codecs that work with Apache Drill are > gz, and bz2. > > Assuming that Additional codecs can be added by including the relevant > libraries in the Drill classpath. > > Please kindly show me the step by step instructions so that I can understand > how exactly to add the "zip" codec and how to include the relevant libraries > in the Drill classpath ? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7419) Enhance Drill splitting logic for compressed files
Arina Ielchiieva created DRILL-7419: --- Summary: Enhance Drill splitting logic for compressed files Key: DRILL-7419 URL: https://issues.apache.org/jira/browse/DRILL-7419 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.16.0 Reporter: Arina Ielchiieva By default Drill treats all compressed files as non-splittable. Drill uses BlockMapBuilder to split a file into blocks if possible. According to its code, it tries to split the file if blockSplittable is set to true and the file IS NOT compressed. So even if a format is block splittable but comes as a compressed file, it won't be split. https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/schedule/BlockMapBuilder.java#L115 But some compression codecs are splittable, for example, bzip2 (https://i.stack.imgur.com/jpprr.jpg). The codec type should be taken into account when deciding whether a file can be split. -- This message was sent by Atlassian Jira (v8.3.4#803005)
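For illustration only (this is not Drill's BlockMapBuilder API), a hedged sketch of what "take codec type into account" could look like. The extension-to-splittability table is an assumption made for this example; bzip2 is block-oriented and splittable while a gzip stream is not:

```java
import java.util.Map;

public class CodecSplittability {
    // Illustrative table only: bzip2 compresses in independent blocks with
    // sync markers, so a reader can start at a block boundary; gzip and raw
    // snappy are single streams with no safe internal split points.
    private static final Map<String, Boolean> SPLITTABLE = Map.of(
        ".bz2", true,     // bzip2: block-oriented, splittable
        ".gz", false,     // gzip: one DEFLATE stream, not splittable
        ".snappy", false  // raw snappy stream (framed variants differ)
    );

    // Decide splittability from the codec instead of treating every
    // compressed file as non-splittable.
    static boolean canSplit(String fileName) {
        return SPLITTABLE.entrySet().stream()
            .filter(e -> fileName.endsWith(e.getKey()))
            .map(Map.Entry::getValue)
            .findFirst()
            .orElse(true); // no known codec suffix: uncompressed, splittable by block
    }

    public static void main(String[] args) {
        System.out.println(canSplit("lineitem.csv.bz2")); // splittable
        System.out.println(canSplit("lineitem.csv.gz"));  // not splittable
    }
}
```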
[jira] [Resolved] (DRILL-5436) Need a way to input password which contains space when calling sqlline
[ https://issues.apache.org/jira/browse/DRILL-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5436. - Resolution: Fixed Fixed in the scope of https://issues.apache.org/jira/browse/DRILL-7401. > Need a way to input password which contains space when calling sqlline > -- > > Key: DRILL-5436 > URL: https://issues.apache.org/jira/browse/DRILL-5436 > Project: Apache Drill > Issue Type: Bug > Components: Client - CLI >Affects Versions: 1.10.0 >Reporter: Hao Zhu >Priority: Major > Fix For: 1.17.0 > > > create a user named "spaceuser" with password "hello world". > All below failed: > {code} > sqlline -u jdbc:drill:zk=xxx -n spaceuser -p 'hello world' > sqlline -u jdbc:drill:zk=xxx -n spaceuser -p "hello world" > sqlline -u jdbc:drill:zk=xxx -n spaceuser -p 'hello\ world' > sqlline -u jdbc:drill:zk=xxx -n spaceuser -p "hello\ world" > {code} > Need a way to input password which contains space when calling sqlline -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7418) MetadataDirectGroupScan improvements
Arina Ielchiieva created DRILL-7418: --- Summary: MetadataDirectGroupScan improvements Key: DRILL-7418 URL: https://issues.apache.org/jira/browse/DRILL-7418 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0

When a count query is converted to a direct scan (the case when statistics and table metadata are available and there is no need to perform the count operation), {{MetadataDirectGroupScan}} is used. Proposed {{MetadataDirectGroupScan}} enhancements:

1. Show the table root instead of listing all table files. If a table has lots of files, the query plan gets polluted with file enumeration. Since the files are not used for the calculation (only metadata is), they are not relevant and can be excluded from the plan.

Before:
{noformat}
00-00 Screen
00-01   Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
00-02     DirectScan(groupscan=[files = [/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_0.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_5.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_4.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_9.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_3.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_6.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_7.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_10.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_2.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_1.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_8.parquet], numFiles = 11, usedMetadataSummaryFile = false, DynamicPojoRecordReader{records = [[1560060, 2880404, 2880404, 0]]}])
{noformat}

After:
{noformat}
00-00 Screen
00-01   Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
00-02     DirectScan(groupscan=[selectionRoot = /drill/testdata/metadata_cache/store_sales_null_blocks_all, numFiles = 11, usedMetadataSummaryFile = false, DynamicPojoRecordReader{records = [[1560060, 2880404, 2880404, 0]]}])
{noformat}

2. Submission of a physical plan which contains {{MetadataDirectGroupScan}} fails with deserialization errors; proper ser / de should be implemented.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-6842) Export to CSV using CREATE TABLE AS (CTAS) is parsed incorrectly
[ https://issues.apache.org/jira/browse/DRILL-6842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6842. - Resolution: Fixed Fixed in the scope of DRILL-6096. > Export to CSV using CREATE TABLE AS (CTAS) is parsed incorrectly > --- > > Key: DRILL-6842 > URL: https://issues.apache.org/jira/browse/DRILL-6842 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Text CSV, Storage - Writer >Affects Versions: 1.14.0 > Environment: - Tested with latest version *Apache Drill* 1.14.0, and > building the latest version from master (Github repo), commit > ad61c6bc1dd24994e50fe7dfed043d5e57dba8f9 at _Nov 5, 2018_. > - *Linux* x64, Ubuntu 16.04 > - *OpenJDK* Runtime Environment (build > 1.8.0_171-8u171-b11-0ubuntu0.17.10.1-b11) > - Apache *Maven* 3.5.0 >Reporter: Mariano Ruiz >Priority: Minor > Labels: csv, export > Fix For: 1.17.0 > > Attachments: Screenshot from 2018-11-09 14-18-43.png > > > When you export the result of a query to CSV using CTAS, most of the time > the generated file is OK, but if the results contain text columns with > "," characters, the resulting CSV file is broken, because it does not enclose > cells containing commas in " characters. > Steps to reproduce the bug: > Let's say you have the following table in some source of data, maybe a CSV > file too: > {code:title=/tmp/input.csv} > product_ean,product_name,product_brand > 12345678900,IPhone X,Apple > 9911100,"Samsung S9, Black",Samsung > 1223456,Smartwatch XY,Some Brand > {code} > Note that the second row of data, in the column "product_name", has a > value with a comma inside (_Samsung S9, Black_), so the whole cell value is > enclosed with " characters, while the rest of the column cells aren't, > though they could be enclosed too. 
> So if you query this file, Drill will interpret the file correctly and does
> not treat the comma inside the cell as a separator like the rest of the
> commas in the file:
> {code}
> 0: jdbc:drill:zk=local> SELECT * FROM dfs.`/tmp/input.csv`;
> +-------------+-------------------+---------------+
> | product_ean | product_name      | product_brand |
> +-------------+-------------------+---------------+
> | 12345678900 | IPhone X          | Apple         |
> | 9911100     | Samsung S9, Black | Samsung       |
> | 1223456     | Smartwatch XY     | Some Brand    |
> +-------------+-------------------+---------------+
> 3 rows selected (1.874 seconds)
> {code}
> But now, if you want to query the file and export the result as CSV using the
> CTAS feature, using the following steps:
> {code}
> 0: jdbc:drill:zk=local> USE dfs.tmp;
> +-------+--------------------------------------+
> | ok    | summary                              |
> +-------+--------------------------------------+
> | true  | Default schema changed to [dfs.tmp]  |
> +-------+--------------------------------------+
> 1 row selected (0.13 seconds)
> 0: jdbc:drill:zk=local> ALTER SESSION SET `store.format`='csv';
> +-------+------------------------+
> | ok    | summary                |
> +-------+------------------------+
> | true  | store.format updated.  |
> +-------+------------------------+
> 1 row selected (0.094 seconds)
> 0: jdbc:drill:zk=local> CREATE TABLE dfs.tmp.my_output AS SELECT * FROM
> dfs.`/tmp/input.csv`;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 3                          |
> +-----------+----------------------------+
> 1 row selected (0.453 seconds)
> {code}
> The output file is this:
> {code:title=/tmp/my_output/0_0_0.csv}
> product_ean,product_name,product_brand
> 12345678900,IPhone X,Apple
> 9911100,Samsung S9, Black,Samsung
> 1223456,Smartwatch XY,Some Brand
> {code}
> The text _Samsung S9, Black_ in the cell is not quoted, so any CSV
> interpreter like an office tool or a Java/Python/... library will interpret it
> as two cells instead of one. Even Apache Drill will interpret it wrong:
> {code}
> 0: jdbc:drill:zk=local> SELECT * FROM dfs.`/tmp/my_output/0_0_0.csv`;
> +-------------+--------------+---------------+
> | product_ean | product_name | product_brand |
> +-------------+--------------+---------------+
> | 12345678900 | IPhone X     | Apple         |
> | 9911100     | Samsung S9   | Black
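The fix boils down to RFC 4180-style quoting on write; a minimal, hypothetical sketch (not Drill's actual text writer code) of the rule the report asks for:

```java
public class CsvQuote {
    // Quote a field per RFC 4180: wrap it in double quotes when it contains
    // the separator, a quote, or a line break, and double any embedded quotes.
    static String quote(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field; // plain fields pass through unchanged
    }

    public static void main(String[] args) {
        System.out.println(quote("Samsung S9, Black")); // gets quoted
        System.out.println(quote("IPhone X"));          // left as-is
    }
}
```

With this rule applied on write, {{Samsung S9, Black}} round-trips as a single cell instead of being split into two.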
[jira] [Resolved] (DRILL-4788) Exporting from Parquet to CSV - commas in strings are not escaped
[ https://issues.apache.org/jira/browse/DRILL-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-4788. - Resolution: Fixed Fixed in the scope of DRILL-6096. > Exporting from Parquet to CSV - commas in strings are not escaped > - > > Key: DRILL-4788 > URL: https://issues.apache.org/jira/browse/DRILL-4788 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.6.0 > Environment: Linux >Reporter: Richard Patching >Priority: Major > Labels: csv, csvparser, export > Fix For: 1.17.0 > > > When exporting data from Parquet to CSV, if there is a column which contains > a comma, the text after the comma gets put into the next column instead of > being escaped. > The only work around is to do REGEXP_REPLACE(COLUMN[0], ',',' ') which > replaced the comma in the string with a blank space. This is not ideal in > terms of keeping a true accurate record of the data we receive. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-6958) CTAS csv with option
[ https://issues.apache.org/jira/browse/DRILL-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6958. - Resolution: Fixed Fixed in the scope of DRILL-6096. > CTAS csv with option > > > Key: DRILL-6958 > URL: https://issues.apache.org/jira/browse/DRILL-6958 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Text CSV >Affects Versions: 1.15.0, 1.16.0 >Reporter: benj >Priority: Major > Fix For: 1.17.0 > > > Currently, it may be difficult to produce well-formed CSV with CTAS (see > comment below). > It appears necessary to have some additional/configurable options to write > CSV files with CTAS: > * possibility to change/define the separator, > * possibility to write or omit the header, > * possibility to force the write of only 1 file instead of lots of parts, > * possibility to force quoting > * possibility to use/change the escape char > * ... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-3850) Execute multiple commands from sqlline -q
[ https://issues.apache.org/jira/browse/DRILL-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-3850. - Resolution: Fixed Fixed in the scope of DRILL-7401. > Execute multiple commands from sqlline -q > - > > Key: DRILL-3850 > URL: https://issues.apache.org/jira/browse/DRILL-3850 > Project: Apache Drill > Issue Type: Bug > Components: Client - CLI >Affects Versions: 1.1.0, 1.2.0 > Environment: Mint 17.1 >Reporter: Philip Deegan >Priority: Major > Fix For: 1.17.0 > > > Be able to perform > {noformat} > ./sqlline -u jdbc:drill:zk=local -q "use dfs.tmp; alter session set > \`store.format\`='csv';" > {noformat} > instead of > {noformat} > ./sqlline -u jdbc:drill:zk=local -q "use dfs.tmp;" > ./sqlline -u jdbc:drill:zk=local -q "alter session set > \`store.format\`='csv';" > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-7195) Query returns incorrect result or does not fail when cast with is null is used in filter condition
[ https://issues.apache.org/jira/browse/DRILL-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-7195. - Resolution: Fixed > Query returns incorrect result or does not fail when cast with is null is > used in filter condition > -- > > Key: DRILL-7195 > URL: https://issues.apache.org/jira/browse/DRILL-7195 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.16.0 >Reporter: Vova Vysotskyi >Assignee: Vova Vysotskyi >Priority: Major > Fix For: 1.17.0 > > > 1. For the case when a query contains a filter with a {{cast}} which cannot be > done, combined with {{is null}}, the query does not fail: > {code:sql} > select * from dfs.tmp.`a.json` as t where cast(t.a as integer) is null; > +---+ > | a | > +---+ > +---+ > No rows selected (0.142 seconds) > {code} > where > {noformat} > cat /tmp/a.json > {"a":"aaa"} > {noformat} > But for the case when this condition is specified in the project, the query, as > expected, fails: > {code:sql} > select cast(t.a as integer) is null from dfs.tmp.`a.json` t; > Error: SYSTEM ERROR: NumberFormatException: aaa > Fragment 0:0 > Please, refer to logs for more information. > [Error Id: ed3982ce-a12f-4d63-bc6e-cafddf28cc24 on user515050-pc:31010] > (state=,code=0) > {code} > This is a regression; for Drill 1.15 both the first and the second queries fail: > {code:sql} > select * from dfs.tmp.`a.json` as t where cast(t.a as integer) is null; > Error: SYSTEM ERROR: NumberFormatException: aaa > Fragment 0:0 > Please, refer to logs for more information. > [Error Id: 2f878f15-ddaa-48cd-9dfb-45c04db39048 on user515050-pc:31010] > (state=,code=0) > {code} > 2. 
For the case when {{drill.exec.functions.cast_empty_string_to_null}} is > enabled, this issue will cause wrong results: > {code:sql} > alter system set `drill.exec.functions.cast_empty_string_to_null`=true; > select * from dfs.tmp.`a1.json` t where cast(t.a as integer) is null; > +---+ > | a | > +---+ > +---+ > No rows selected (1.759 seconds) > {code} > where > {noformat} > cat /tmp/a1.json > {"a":"1"} > {"a":""} > {noformat} > Result for Drill 1.15.0: > {code:sql} > select * from dfs.tmp.`a1.json` t where cast(t.a as integer) is null; > ++ > | a | > ++ > || > ++ > 1 row selected (1.724 seconds) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7401) Sqlline 1.9 upgrade
Arina Ielchiieva created DRILL-7401: --- Summary: Sqlline 1.9 upgrade Key: DRILL-7401 URL: https://issues.apache.org/jira/browse/DRILL-7401 Project: Apache Drill Issue Type: Task Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Upgrade to SqlLine 1.9 once it is released (https://github.com/julianhyde/sqlline/issues/350). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-5929) Misleading error for text file with blank line delimiter
[ https://issues.apache.org/jira/browse/DRILL-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5929. - Resolution: Fixed Fixed with the V3 reader introduction. The error message is now: {{org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: The text format line delimiter cannot be blank.}} > Misleading error for text file with blank line delimiter > > > Key: DRILL-5929 > URL: https://issues.apache.org/jira/browse/DRILL-5929 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > Fix For: 1.17.0 > > > Consider the following functional test query: > {code} > select * from > table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\')) > {code} > For some reason (yet to be determined), when running this from Java, the line > delimiter ended up empty. This causes the following line to fail with an > {{ArrayIndexOutOfBoundsException}}: > {code} > class TextInput ... > public final byte nextChar() throws IOException { > if (byteChar == lineSeparator[0]) { // but, lineSeparator.length == 0 > {code} > We then translate the exception: > {code} > class TextReader ... > public final boolean parseNext() throws IOException { > ... > } catch (Exception ex) { > try { > throw handleException(ex); > ... > private TextParsingException handleException(Exception ex) throws > IOException { > ... > if (ex instanceof ArrayIndexOutOfBoundsException) { > // Not clear this exception is still thrown... > ex = UserException > .dataReadError(ex) > .message( > "Drill failed to read your text file. Drill supports up to %d > columns in a text file. Your file appears to have more than that.", > MAXIMUM_NUMBER_COLUMNS) > .build(logger); > } > {code} > That is, due to a missing delimiter, we get an index out of bounds exception, > which we translate to an error about having too many fields. 
But, the file > itself has only a handful of fields. Thus, the error is completely wrong. > Then, we compound the error: > {code} > private TextParsingException handleException(Exception ex) throws > IOException { > ... > throw new TextParsingException(context, message, ex); > class CompliantTextReader ... > public boolean next() { > ... > } catch (IOException | TextParsingException e) { > throw UserException.dataReadError(e) > .addContext("Failure while reading file %s. Happened at or shortly > before byte position %d.", > split.getPath(), reader.getPos()) > .build(logger); > {code} > That is, our AIOB exception became a user exception that became a text > parsing exception that became a data read error. > But, this is not a data read error. It is an error in Drill's own validation > logic. Not clear we should be wrapping user exceptions in other errors that > we wrap in other user exceptions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
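The resolution comment above says the V3 reader now rejects a blank line delimiter with a single, accurate validation error instead of letting the {{ArrayIndexOutOfBoundsException}} bubble up through several wrappers. A minimal sketch of that kind of early check; the class, method, and exception type here are illustrative, not Drill's actual implementation:

```java
// Illustrative early validation: reject a blank delimiter at setup time,
// before anything like TextInput ever indexes lineSeparator[0].
class DelimiterCheck {
    static byte[] validateLineDelimiter(byte[] delimiter) {
        if (delimiter == null || delimiter.length == 0) {
            // Message mirrors the one quoted in the resolution comment.
            throw new IllegalArgumentException(
                "The text format line delimiter cannot be blank.");
        }
        return delimiter;
    }
}
```

Failing fast at setup time avoids the chain of misleading wrapped errors the report walks through: one validation error, raised before any record is read.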
[jira] [Resolved] (DRILL-6984) from table escape parameter not deleted when defined with value other than '"'
[ https://issues.apache.org/jira/browse/DRILL-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6984. - Resolution: Fixed Issue was fixed with the v3 text reader introduction and several subsequent fixes. > from table escape parameter not deleted when defined with value other than > '"' > --- > > Key: DRILL-6984 > URL: https://issues.apache.org/jira/browse/DRILL-6984 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.15.0 >Reporter: benj >Priority: Minor > Fix For: 1.17.0 > > > When changing the escape char, it is kept in the output instead of being > deleted as it is with '"' > Example : > the file sample.csv : > {code:java} > name|age|subject|info > Bob|21|"lovely quotes"|none > Bill|23|"character @"|@" pipe"|none > {code} > {code:java} > SELECT * FROM table(tmp`sample.csv`(`escape`=>'@', type => 'text', > fieldDelimiter => '|',quote=>'"', extractHeader => true)); > {code} > The result is > {code:java} > | name | age | subject | info | > +-+--+---+---+ > | Bob | 21 | lovely quotes | none | > | Bill | 23 | character @"|@" pipe | none | > {code} > As we expect : < character "|" pipe > (without the escape char (@)) > > Note that we get the correct behavior when using the quote ('"') as the > escaping character > {code:java} > name|age|subject|info > Bob|21|"lovely quotes"|none > Bill|23|"character ""|"" pipe"|none > {code} > {code:java} > SELECT * FROM table(tmp`sample.csv`(`escape`=>'"', type => 'text', > fieldDelimiter => '|',quote=>'"', extractHeader => true)); > OR > SELECT * FROM table(tmp`sample.csv`(type => 'text', fieldDelimiter => > '|',quote=>'"', extractHeader => true)); > {code} > The result is OK with > {code:java} > | name | age | subject | info | > +-+--+---+---+ > | Bob | 21 | lovely quotes | none | > | Bill| 23 | character "|" pipe | none | > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
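The expected behavior the report describes — inside a quoted field, an escape character before a quote yields a literal quote and the escape itself is dropped — can be sketched as follows. This is a hypothetical helper written for illustration, not Drill's reader code:

```java
// Hypothetical unescaping step for an already-extracted quoted field:
// escape+quote collapses to a literal quote; everything else passes through.
class EscapeSketch {
    static String unescape(String field, char escape, char quote) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == escape && i + 1 < field.length() && field.charAt(i + 1) == quote) {
                out.append(quote); // keep the quote, drop the escape
                i++;               // skip past the escaped quote
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Applied to the reported value `character @"|@" pipe` with escape `@` and quote `"`, this yields the expected `character "|" pipe`.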
[jira] [Resolved] (DRILL-4195) NullPointerException received on ResultSet.next() call for query
[ https://issues.apache.org/jira/browse/DRILL-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-4195. - Resolution: Cannot Reproduce > NullPointerException received on ResultSet.next() call for query > > > Key: DRILL-4195 > URL: https://issues.apache.org/jira/browse/DRILL-4195 > Project: Apache Drill > Issue Type: Bug > Components: Client - JDBC >Affects Versions: 1.2.0 > Environment: Linux lnxx64r6 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May > 10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Sergio Lob >Priority: Major > Attachments: cre8table_trdmix91.sql, setup, test.class, test.java, > test.log, trdmix91.csv > > > NullPointerException received on ResultSet.next() call for a particular > query. We have several different queries that produce NullPointerException. > One is: > "SELECT T1.`fa02int` AS `SK001`, MAX(T1.`fa06char_5`) FROM > `hive`.`default`.`trdmix91` `T1` GROUP BY T1.`fa02int` ORDER BY `SK001`" > During invocation of rs.next(), I receive the following > exception stack trace: > invoking ResultSet.next() to get to first row: > Exception: java.sql.SQLException: SYSTEM ERROR: NullPointerException > Fragment 0:0 > [Error Id: e7dc2d6e-ab32-4d6d-a593-7fe09a677393 on maprdemo:31010] > java.sql.SQLException: SYSTEM ERROR: NullPointerException > Fragment 0:0 > [Error Id: e7dc2d6e-ab32-4d6d-a593-7fe09a677393 on maprdemo:31010] > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor. 
> java:247) > at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:320) > at > oadd.net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.ja > va:187) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl > .java:160) > at test.main(test.java:64) > Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: > SYSTEM E > RROR: NullPointerException > Fragment 0:0 > [Error Id: e7dc2d6e-ab32-4d6d-a593-7fe09a677393 on maprdemo:31010] > at > oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived( > QueryResultHandler.java:118) > at > oadd.org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClie > nt.java:110) > at > oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(Basic > ClientWithConnection.java:47) > at > oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(Basic > ClientWithConnection.java:32) > at oadd.org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61) > at > oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.ja > va:233) > at > oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.ja > va:205) > at > oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(Messa > geToMessageDecoder.java:89) > at > oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead > (AbstractChannelHandlerContext.java:339) > at > oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(A > bstractChannelHandlerContext.java:324) > at > oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateH > andler.java:254) > at > oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead > (AbstractChannelHandlerContext.java:339) > at > oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(A > bstractChannelHandlerContext.java:324) > at > oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(Messa > geToMessageDecoder.java:103) > at > 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead > (AbstractChannelHandlerContext.java:339) > at > oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(A > bstractChannelHandlerContext.java:324) > at > oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMe > ssageDecoder.java:242) > at > oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead > (AbstractChannelHandlerContext.java:339) > at > oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(A > bstractChannelHandlerContext.java:324) > at > oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(Channe > lInboundHandlerAdapter.java:86) > at > oadd.io.netty.channe
[jira] [Resolved] (DRILL-5557) java.lang.IndexOutOfBoundsException: writerIndex:
[ https://issues.apache.org/jira/browse/DRILL-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5557. - Resolution: Fixed > java.lang.IndexOutOfBoundsException: writerIndex: > -- > > Key: DRILL-5557 > URL: https://issues.apache.org/jira/browse/DRILL-5557 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: renlu >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-5492) CSV reader does not validate header names, causes nonsense output
[ https://issues.apache.org/jira/browse/DRILL-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5492. - Resolution: Fixed > CSV reader does not validate header names, causes nonsense output > - > > Key: DRILL-5492 > URL: https://issues.apache.org/jira/browse/DRILL-5492 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > Consider the same test case as in DRILL-5491, but with a slightly different > input file: > {code} > ___ > a,b,c > d,e,f > {code} > The underscores represent three spaces: use spaces in the real test. > In this case, the code discussed in DRILL-5491 finds some characters and > happily returns the following array: > {code} > [" "] > {code} > The field name of three blanks is returned to the client to produce the > following bizarre output: > {code} > 2 row(s): > > a > d > {code} > The blank line is normally the header, but the header here was considered to > be three blanks. (In fact, the blanks are actually printed.) > Since the blanks were considered to be a field, the file is assumed to have > only one field, so only the first column was returned. > The expected behavior is that spaces are trimmed from field names, so the > field name list would be empty and a User Error thrown. (That is, it is > confusing to the user why a blank line produces NPE, some produce the > {{ExecutionSetupException}} shown in DRILL-5491, and some produce blank > headings.) Behavior should be consistent. -- This message was sent by Atlassian Jira (v8.3.4#803005)
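The expected behavior stated above — trim the header names and raise a clear user error when the header line contains nothing usable — can be sketched like this. The helper below is hypothetical, written only to illustrate the check; it is not Drill's header-extraction code:

```java
// Hypothetical header validation: trim each name; if every name is blank,
// fail with a user-facing error instead of returning a field named "   ".
class HeaderCheck {
    static String[] extractHeaders(String headerLine) {
        String[] names = headerLine.split(",", -1); // -1 keeps trailing empties
        boolean anyNonBlank = false;
        for (int i = 0; i < names.length; i++) {
            names[i] = names[i].trim();
            if (!names[i].isEmpty()) {
                anyNonBlank = true;
            }
        }
        if (!anyNonBlank) {
            throw new IllegalArgumentException("Text file header line is blank");
        }
        return names;
    }
}
```

A single consistent error here would replace the three divergent behaviors the report complains about (NPE, {{ExecutionSetupException}}, and blank headings).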
[jira] [Resolved] (DRILL-5491) NPE when reading a CSV file, with headers, but blank header line
[ https://issues.apache.org/jira/browse/DRILL-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5491. - Resolution: Fixed > NPE when reading a CSV file, with headers, but blank header line > > > Key: DRILL-5491 > URL: https://issues.apache.org/jira/browse/DRILL-5491 > Project: Apache Drill > Issue Type: Sub-task >Affects Versions: 1.8.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Fix For: 1.17.0 > > > See DRILL-5490 for background. > Try this unit test case: > {code} > FixtureBuilder builder = ClusterFixture.builder() > .maxParallelization(1); > try (ClusterFixture cluster = builder.build(); > ClientFixture client = cluster.clientFixture()) { > TextFormatConfig csvFormat = new TextFormatConfig(); > csvFormat.fieldDelimiter = ','; > csvFormat.skipFirstLine = false; > csvFormat.extractHeader = true; > cluster.defineWorkspace("dfs", "data", "/tmp/data", "csv", csvFormat); > String sql = "SELECT * FROM `dfs.data`.`csv/test7.csv`"; > client.queryBuilder().sql(sql).printCsv(); > } > } > {code} > The test can also be run as a query using your favorite client. > Using this input file: > {code} > a,b,c > d,e,f > {code} > (The first line is blank.) > The following is the result: > {code} > Exception (no rows returned): > org.apache.drill.common.exceptions.UserRemoteException: > SYSTEM ERROR: NullPointerException > {code} > The {{RepeatedVarCharOutput}} class tries (but fails for the reasons outlined > in DRILL-5490) to detect this case. 
> The code crashes here in {{CompliantTextRecordReader.extractHeader()}}: > {code} > String [] fieldNames = ((RepeatedVarCharOutput)hOutput).getTextOutput(); > {code} > Because of bad code in {{RepeatedVarCharOutput.getTextOutput()}}: > {code} > public String [] getTextOutput () throws ExecutionSetupException { > if (recordCount == 0 || fieldIndex == -1) { > return null; > } > if (this.recordStart != characterData) { > throw new ExecutionSetupException("record text was requested before > finishing record"); > } > {code} > Since there is no text on the line, special code elsewhere (see DRILL-5490) > elects not to increment the {{recordCount}}. (BTW: {{recordCount}} is the > total across-batch count; probably the in-batch count, {{batchIndex}}, was > wanted here.) Since the count is zero, we return null. > The author probably thought we'd get a zero-length record, in which case the > if-statement throws an exception. But see DRILL-5490 about why > this code does not actually work. > The result is one bug (not incrementing the record count), triggering another > (returning a null), which masks a third ({{recordStart}} is not set correctly, > so the exception would not be thrown). > All that bad code is just fun and games until we get an NPE, however. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-5451) Query on csv file w/ header fails with an exception when non existing column is requested if file is over 4096 lines long
[ https://issues.apache.org/jira/browse/DRILL-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5451. - Resolution: Fixed > Query on csv file w/ header fails with an exception when non existing column > is requested if file is over 4096 lines long > - > > Key: DRILL-5451 > URL: https://issues.apache.org/jira/browse/DRILL-5451 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Text CSV >Affects Versions: 1.10.0 > Environment: Tested on CentOs 7 and Ubuntu >Reporter: Paul Wilson >Assignee: Paul Rogers >Priority: Major > Fix For: 1.17.0 > > Attachments: 4097_lines.csvh > > > When querying a text (csv) file with extractHeaders set to true, selecting a > non-existent column works as expected (returns "empty" value) when the file has > 4096 lines or fewer (1 header plus 4095 data), but results in an > IndexOutOfBoundsException when the file has 4097 lines or more. > With Storage config: > {code:javascript} > "csvh": { > "type": "text", > "extensions": [ > "csvh" > ], > "extractHeader": true, > "delimiter": "," > } > {code} > In the following, 4096_lines.csvh is identical to 4097_lines.csvh with the > last line removed. 
> Results: > {noformat} > 0: jdbc:drill:zk=local> select * from dfs.`/test/4097_lines.csvh` LIMIT 2; > +--++ > | line_no |line_description| > +--++ > | 2| this is line number 2 | > | 3| this is line number 3 | > +--++ > 2 rows selected (2.455 seconds) > 0: jdbc:drill:zk=local> select line_no, non_existent_field from > dfs.`/test/4096_lines.csvh` LIMIT 2; > +--+-+ > | line_no | non_existent_field | > +--+-+ > | 2| | > | 3| | > +--+-+ > 2 rows selected (2.248 seconds) > 0: jdbc:drill:zk=local> select line_no, non_existent_field from > dfs.`/test/4097_lines.csvh` LIMIT 2; > Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 > (expected: range(0, 16384)) > Fragment 0:0 > [Error Id: eb0974a8-026d-4048-9f10-ffb821a0d300 on localhost:31010] > (java.lang.IndexOutOfBoundsException) index: 16384, length: 4 (expected: > range(0, 16384)) > io.netty.buffer.DrillBuf.checkIndexD():123 > io.netty.buffer.DrillBuf.chk():147 > io.netty.buffer.DrillBuf.getInt():520 > org.apache.drill.exec.vector.UInt4Vector$Accessor.get():358 > org.apache.drill.exec.vector.VarCharVector$Mutator.setValueCount():659 > org.apache.drill.exec.physical.impl.ScanBatch.next():234 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 > org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115 > org.apache.drill.exec.record.AbstractRecordBatch.next():162 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 > > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93 > org.apache.drill.exec.record.AbstractRecordBatch.next():162 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135 > org.apache.drill.exec.record.AbstractRecordBatch.next():162 > org.apache.drill.exec.physical.impl.BaseRootExec.next():104 > > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 > org.apache.drill.exec.physical.impl.BaseRootExec.next():94 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > org.apache.hadoop.security.UserGroupInfo
[jira] [Created] (DRILL-7397) Fix logback errors when building the project
Arina Ielchiieva created DRILL-7397: --- Summary: Fix logback errors when building the project Key: DRILL-7397 URL: https://issues.apache.org/jira/browse/DRILL-7397 Project: Apache Drill Issue Type: Task Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Bohdan Kazydub {noformat} [INFO] Compiling 75 source files to /.../drill/common/target/classes [WARNING] Unable to autodetect 'javac' path, using 'javac' from the environment. [INFO] [INFO] --- exec-maven-plugin:1.6.0:java (default) @ drill-common --- 17:46:05,674 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy] 17:46:05,675 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback-test.xml] at [file:/.../drill/common/src/test/resources/logback-test.xml] 17:46:05,712 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set 17:46:05,714 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - Could not find Janino library on the class path. Skipping conditional processing. 17:46:05,714 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See also http://logback.qos.ch/codes.html#ifJanino 17:46:05,714 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender] 17:46:05,719 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT] 17:46:05,724 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property 17:46:05,740 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - ROOT level set to ERROR 17:46:05,740 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - Could not find Janino library on the class path. Skipping conditional processing. 
17:46:05,740 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See also http://logback.qos.ch/codes.html#ifJanino 17:46:05,740 |-ERROR in ch.qos.logback.core.joran.action.AppenderRefAction - Could not find an AppenderAttachable at the top of execution stack. Near [appender-ref] line 59 17:46:05,740 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - The object on the top the of the stack is not the root logger 17:46:05,740 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - It is: ch.qos.logback.core.joran.conditional.IfAction 17:46:05,740 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration. 17:46:05,741 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@58e3a2c7 - Registering current configuration as safe fallback point {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
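The ERROR lines above come from logback's conditional configuration ({{<if>}} elements), which requires the Janino library at runtime, as the referenced http://logback.qos.ch/codes.html#ifJanino page explains. One common remedy is to put Janino on the classpath; the dependency coordinates below are the standard ones, but the version and scope are illustrative and not taken from the actual Drill fix:

```xml
<!-- Illustrative: lets logback evaluate <if> conditions in logback-test.xml.
     Version and scope are examples, not from the DRILL-7397 patch. -->
<dependency>
  <groupId>org.codehaus.janino</groupId>
  <artifactId>janino</artifactId>
  <version>3.0.11</version>
  <scope>test</scope>
</dependency>
```

The alternative remedy is to drop the conditional blocks from the logback config so Janino is not needed at all.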
[jira] [Resolved] (DRILL-6885) CTAS for empty output doesn't create parquet file or folder
[ https://issues.apache.org/jira/browse/DRILL-6885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6885. - Resolution: Fixed > CTAS for empty output doesn't create parquet file or folder > --- > > Key: DRILL-6885 > URL: https://issues.apache.org/jira/browse/DRILL-6885 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.14.0 >Reporter: Vitalii Diravka >Priority: Major > Fix For: 1.17.0 > > > CTAS for empty output can create empty tables based on the empty json or csv > files. But it doesn't work for parquet files. > See examples below: > {code:java} > 0: jdbc:drill:zk=local> use dfs.tmp; > +---+--+ > | ok | summary| > +---+--+ > | true | Default schema changed to [dfs.tmp] | > +---+--+ > 1 row selected (0.087 seconds) > 0: jdbc:drill:zk=local> select * from `empty_dir`; > +--+ > | | > +--+ > +--+ > No rows selected (0.083 seconds) > 0: jdbc:drill:zk=local> alter session set `store.format` = 'json'; > +---++ > | ok |summary | > +---++ > | true | store.format updated. | > +---++ > 1 row selected (0.079 seconds) > 0: jdbc:drill:zk=local> create table `empty_json` as select * from > `empty_dir`; > +---++ > | Fragment | Number of records written | > +---++ > | 0_0 | 0 | > +---++ > 1 row selected (0.128 seconds) > 0: jdbc:drill:zk=local> select * from `empty_json`; > +--+ > | | > +--+ > +--+ > No rows selected (0.086 seconds) > 0: jdbc:drill:zk=local> alter session set `store.format` = 'csv'; > +---++ > | ok |summary | > +---++ > | true | store.format updated. 
| > +---++ > 1 row selected (0.073 seconds) > 0: jdbc:drill:zk=local> create table `empty_csv` as select * from `empty_dir`; > +---++ > | Fragment | Number of records written | > +---++ > | 0_0 | 0 | > +---++ > 1 row selected (0.135 seconds) > 0: jdbc:drill:zk=local> select * from `empty_csv`; > +--+ > | columns | > +--+ > | [] | > +--+ > 1 row selected (0.086 seconds) > 0: jdbc:drill:zk=local> alter session set `store.format` = 'parquet'; > +---++ > | ok |summary | > +---++ > | true | store.format updated. | > +---++ > 1 row selected (0.073 seconds) > 0: jdbc:drill:zk=local> create table `empty_parquet` as select * from > `empty_dir`; > +---++ > | Fragment | Number of records written | > +---++ > | 0_0 | 0 | > +---++ > 1 row selected (0.099 seconds) > 0: jdbc:drill:zk=local> select * from `empty_parquet`; > 20:41:01.619 [23f692c1-8994-9fc8-2ce4-5fc6135ebcc9:foreman] ERROR > o.a.calcite.runtime.CalciteException - > org.apache.calcite.sql.validate.SqlValidatorException: Object 'empty_parquet' > not found > 20:41:01.619 [23f692c1-8994-9fc8-2ce4-5fc6135ebcc9:foreman] ERROR > o.a.calcite.runtime.CalciteException - > org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to > line 1, column 29: Object 'empty_parquet' not found > 20:41:01.622 [Client-1] ERROR o.a.calcite.runtime.CalciteException - > org.apache.calcite.sql.validate.SqlValidatorException: Object 'empty_parquet' > not found > 20:41:01.623 [Client-1] ERROR o.a.calcite.runtime.CalciteException - > org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to > line 1, column 29: Object 'empty_parquet' not found: Object 'empty_parquet' > not found > Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 29: Object > 'empty_parquet' not found > [Error Id: 879730dc-aad6-4fc7-9c62-9ad8bbc99d42 on vitalii-pc:31010] > (state=,code=0) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-4949) Need better handling of empty parquet files
[ https://issues.apache.org/jira/browse/DRILL-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-4949. - Resolution: Fixed > Need better handling of empty parquet files > --- > > Key: DRILL-4949 > URL: https://issues.apache.org/jira/browse/DRILL-4949 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.9.0 >Reporter: Krystal >Priority: Major > Fix For: 1.17.0 > > > I have an empty parquet file created from hive. When I tried to query > against this table I got "IllegalArgumentException". > {code} > select * from `test_dir/voter_empty`; > Error: SYSTEM ERROR: IllegalArgumentException: MinorFragmentId 0 has no read > entries assigned > (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception > during fragment initialization: MinorFragmentId 0 has no read entries assigned > org.apache.drill.exec.work.foreman.Foreman.run():281 > java.util.concurrent.ThreadPoolExecutor.runWorker():1145 > java.util.concurrent.ThreadPoolExecutor$Worker.run():615 > java.lang.Thread.run():745 > Caused By (java.lang.IllegalArgumentException) MinorFragmentId 0 has no > read entries assigned > com.google.common.base.Preconditions.checkArgument():122 > org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan():824 > org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan():101 > org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan():68 > org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan():35 > org.apache.drill.exec.physical.base.AbstractGroupScan.accept():63 > org.apache.drill.exec.planner.fragment.Materializer.visitOp():102 > org.apache.drill.exec.planner.fragment.Materializer.visitOp():35 > > org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitProject():79 > org.apache.drill.exec.physical.config.Project.accept():51 > org.apache.drill.exec.planner.fragment.Materializer.visitStore():82 > 
org.apache.drill.exec.planner.fragment.Materializer.visitStore():35 > > org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitScreen():202 > org.apache.drill.exec.physical.config.Screen.accept():98 > > org.apache.drill.exec.planner.fragment.SimpleParallelizer.generateWorkUnit():283 > > org.apache.drill.exec.planner.fragment.SimpleParallelizer.getFragments():127 > org.apache.drill.exec.work.foreman.Foreman.getQueryWorkUnit():596 > org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan():426 > org.apache.drill.exec.work.foreman.Foreman.runSQL():1010 > org.apache.drill.exec.work.foreman.Foreman.run():264 > java.util.concurrent.ThreadPoolExecutor.runWorker():1145 > java.util.concurrent.ThreadPoolExecutor$Worker.run():615 > java.lang.Thread.run():745 (state=,code=0) > {code} > Either drill should block the query and display a user friendly error message > or allow the query to run and return empty result. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (DRILL-1834) Misleading error message when querying an empty Parquet file
[ https://issues.apache.org/jira/browse/DRILL-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-1834. - Resolution: Fixed > Misleading error message when querying an empty Parquet file > > > Key: DRILL-1834 > URL: https://issues.apache.org/jira/browse/DRILL-1834 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 0.7.0 >Reporter: Aman Sinha >Priority: Minor > Fix For: 1.17.0 > > > It is possible that a CTAS may fail and create an empty Parquet file. When > we run a query against this file, we get a misleading error message from the > planner that hides the original IOException, although the log file does have > the original exception: > {code:sql} > 0: jdbc:drill:zk=local> select count(*) from dfs.`/tmp/empty.parquet`; > Query failed: Query failed: Unexpected exception during fragment > initialization: Internal error: Error while applying rule > DrillPushProjIntoScan, args > [rel#77:ProjectRel.NONE.ANY([]).[](child=rel#76:Subset#0.ENUMERABLE.ANY([]).[],$f0=0), > rel#68:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[dfs, > /tmp/empty.parquet])] > {code} > The cause of the exception is in the logs: > Caused by: java.io.IOException: Could not read footer: > java.lang.RuntimeException: file:/tmp/empty.parquet is not a Parquet file > (too small) > at > parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:195) > ~[parquet-hadoop-1.5.1-drill-r4.jar:0.7.0-SNAPSHOT] > at > parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:208) > ~[parquet-hadoop-1.5.1-drill-r4.jar:0.7.0-SNAPSHOT] > at > parquet.hadoop.ParquetFileReader.readFooters(ParquetFileReader.java:224) > ~[parquet-hadoop-1.5.1-drill-r4.jar:0.7.0-SNAPSHOT] > at > org.apache.drill.exec.store.parquet.ParquetGroupScan.readFooter(ParquetGroupScan.java:208) > ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT] -- This message was sent by Atlassian Jira 
(v8.3.4#803005)
[jira] [Created] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run
Arina Ielchiieva created DRILL-7393: --- Summary: Revisit Drill tests to ensure that patching is executed before any test run Key: DRILL-7393 URL: https://issues.apache.org/jira/browse/DRILL-7393 Project: Apache Drill Issue Type: Task Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, ProtobufPatcher); patching must be done before the classes to be patched are loaded. That's why this operation is executed in a static block in the Drillbit class. Some tests in the java-exec module use the Drillbit class, some extend the DrillTest class; both of them patch Guava. But there are some tests that do not call the patcher yet load classes to be patched. For example, {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads the Guava Preconditions class. If such tests run before tests that require patching, the test run will fail since patching won't succeed. The Patcher code does not fail the application if patching was incomplete, it just logs a warning ({{logger.warn("Unable to patch Guava classes.", e);}}), so sometimes it is hard to identify the root cause of unit test failures. We need to revisit all Drill tests to ensure that all of them extend a common test base class which patches Protobuf and Guava classes in a static block. Also refactor the Patcher classes to use assertions so patching fails during unit testing if there are any problems. After all tests are revised, we can remove the {{metastore-test}} execution from main.xml in {{maven-surefire-plugin}}, which was added to ensure that all Metastore tests run in a separate JVM where patching is done first, since the Iceberg Metastore heavily depends on the patched Guava Preconditions class. -- This message was sent by Atlassian Jira (v8.3.4#803005)
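The ordering contract described above can be sketched in a minimal, self-contained form. {{Patcher}} and {{BaseDrillTest}} below are hypothetical stand-ins for Drill's GuavaPatcher/ProtobufPatcher and the proposed common test base class; the point is that an idempotent patch call placed in a static block runs before any subclass test body can load the classes to be patched:

```java
// Stand-in for GuavaPatcher/ProtobufPatcher: idempotent, so it is safe to
// trigger from many entry points (Drillbit, test base classes, ...).
class Patcher {
    private static boolean applied;

    static synchronized void ensurePatched() {
        if (applied) {
            return;                // already patched: no-op
        }
        // ... bytecode patching of Guava/Protobuf classes would happen here ...
        applied = true;
    }

    static synchronized boolean isApplied() {
        return applied;
    }
}

// Proposed common test base class: the static block guarantees patching
// happens during class initialization, before any subclass test runs.
class BaseDrillTest {
    static {
        Patcher.ensurePatched();
    }
}
```

With this pattern, a test that merely extends the base class cannot observe unpatched classes, which is exactly the guarantee the issue asks to enforce across all Drill tests.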
[jira] [Resolved] (DRILL-6835) Schema Provision using File / Table Function
[ https://issues.apache.org/jira/browse/DRILL-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6835. - Resolution: Fixed > Schema Provision using File / Table Function > > > Key: DRILL-6835 > URL: https://issues.apache.org/jira/browse/DRILL-6835 > Project: Apache Drill > Issue Type: New Feature > Reporter: Arina Ielchiieva > Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.17.0 > > > Schema Provision using File / Table Function design document: > https://docs.google.com/document/d/1mp4egSbNs8jFYRbPVbm_l0Y5GjH3HnoqCmOpMTR_g4w/edit?usp=sharing > Phase 1 functional specification - > https://docs.google.com/document/d/1ExVgx2FDqxAz5GTqyWt-_1-UqwRSTGLGEYuc8gsESG8/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (DRILL-7368) Query from Iceberg Metastore fails if filter column contains null
Arina Ielchiieva created DRILL-7368: --- Summary: Query from Iceberg Metastore fails if filter column contains null Key: DRILL-7368 URL: https://issues.apache.org/jira/browse/DRILL-7368 Project: Apache Drill Issue Type: Bug Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 When querying data from the Drill Iceberg Metastore, the query fails if a filter column contains null. The problem is in the Iceberg implementation - https://github.com/apache/incubator-iceberg/pull/443 Fix: upgrade to the latest Iceberg commit, which includes the appropriate fix. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (DRILL-7367) Remove Server details from response headers
Arina Ielchiieva created DRILL-7367: --- Summary: Remove Server details from response headers Key: DRILL-7367 URL: https://issues.apache.org/jira/browse/DRILL-7367 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 Drill response headers include Server information, which is considered a vulnerability (information disclosure). {noformat} curl http://localhost:8047/cluster.json -v -k * Trying ::1... * TCP_NODELAY set * Connected to localhost (::1) port 8047 (#0) > GET /cluster.json HTTP/1.1 > Host: localhost:8047 > User-Agent: curl/7.54.0 > Accept: */* > < HTTP/1.1 200 OK < Date: Thu, 05 Sep 2019 12:47:53 GMT < Content-Type: application/json < Content-Length: 436 < Server: Jetty(9.3.25.v20180904) {noformat} https://pentest-tools.com/blog/essential-http-security-headers/ -- This message was sent by Atlassian Jira (v8.3.2#803003)
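Jetty can suppress the Server header via its HttpConfiguration. A sketch under the assumption of Jetty 9.x (the version shown in the curl output); this is not Drill's actual startup code, and the class name NoServerHeader is made up:

```java
import org.eclipse.jetty.server.HttpConfiguration;
import org.eclipse.jetty.server.HttpConnectionFactory;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

// Sketch: build a connector whose responses omit server identification headers.
public class NoServerHeader {
  static Server createServer(int port) {
    Server server = new Server();
    HttpConfiguration httpConfig = new HttpConfiguration();
    httpConfig.setSendServerVersion(false); // suppresses "Server: Jetty(...)"
    httpConfig.setSendXPoweredBy(false);    // suppresses "X-Powered-By", if enabled
    ServerConnector connector =
        new ServerConnector(server, new HttpConnectionFactory(httpConfig));
    connector.setPort(port);
    server.addConnector(connector);
    return server;
  }
}
```

With this configuration, the curl output above would no longer contain the `Server: Jetty(...)` line.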
Re: [ANNOUNCE] New PMC Chair of Apache Drill
Definitely the wrong topic for such questions. Anyway, here are the answers to your questions: 1. Not planned in the near future. 2. Will be supported in Drill 1.17. Kind regards, Arina On Wed, Aug 28, 2019 at 2:30 PM wrote: > Hallo guys, > some news about: > 1) "INSERT STATEMENT" into parquet files??? > 2) creation of empty parquet files? > > Thks > Alessandro > > > > > -Messaggio originale- > Da: weijie tong > Inviato: lunedì 26 agosto 2019 04:43 > A: dev > Cc: u...@drill.apache.org > Oggetto: Re: [ANNOUNCE] New PMC Chair of Apache Drill > > Congratulations Charles. > > On Sat, Aug 24, 2019 at 11:33 AM Robert Hou wrote: > > > Congratulations Charles, and thanks for your contributions to Drill! > > > > Thank you Arina for all you have done as PMC Chair this past year. > > > > --Robert > > > > On Fri, Aug 23, 2019 at 4:16 PM Khurram Faraaz > > wrote: > > > > > Congratulations Charles, and thank you Arina. > > > > > > Regards, > > > Khurram > > > > > > On Fri, Aug 23, 2019 at 2:54 PM Niels Basjes wrote: > > > > > > > Congratulations Charles. > > > > > > > > Niels Basjes > > > > > > > > On Thu, Aug 22, 2019, 09:28 Arina Ielchiieva > wrote: > > > > > > > > > Hi all, > > > > > > > > > > It has been a honor to serve as Drill Chair during the past year > > > > > but > > > it's > > > > > high time for the new one... > > > > > > > > > > I am very pleased to announce that the Drill PMC has voted to > > > > > elect > > > > Charles > > > > > Givre as the new PMC chair of Apache Drill. He has also been > > > > > approved unanimously by the Apache Board in last board meeting. > > > > > > > > > > Congratulations, Charles! > > > > > > > > > > Kind regards, > > > > > Arina > > > > > > > > > > > > > > > >
[jira] [Created] (DRILL-7361) Add Map (Dict) support for schema file provisioning
Arina Ielchiieva created DRILL-7361: --- Summary: Add Map (Dict) support for schema file provisioning Key: DRILL-7361 URL: https://issues.apache.org/jira/browse/DRILL-7361 Project: Apache Drill Issue Type: New Feature Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Once Dict is added to the row set framework, schema commands must be able to process this type. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (DRILL-7360) Refactor WatchService code in Drillbit class
Arina Ielchiieva created DRILL-7360: --- Summary: Refactor WatchService code in Drillbit class Key: DRILL-7360 URL: https://issues.apache.org/jira/browse/DRILL-7360 Project: Apache Drill Issue Type: Task Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 Refactor the WatchService to use proper code (see https://docs.oracle.com/javase/tutorial/essential/io/notification.html for details) and fix concurrency issues caused by variables being assigned from different threads. -- This message was sent by Atlassian Jira (v8.3.2#803003)
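A minimal sketch of the pattern from the linked Oracle tutorial: register a directory, poll with a timeout rather than blocking forever, and reset the key after draining events. The directory and file names here are made up; this is not Drill's actual Drillbit code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

public class ConfWatcher {
  public static void main(String[] args) throws IOException, InterruptedException {
    Path dir = Files.createTempDirectory("drill-conf");
    try (WatchService watcher = dir.getFileSystem().newWatchService()) {
      // Register interest in creations; ENTRY_MODIFY / ENTRY_DELETE work the same way.
      dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
      Files.createFile(dir.resolve("drill-override.conf"));
      // poll(timeout) instead of take(): a shutdown request cannot hang forever.
      WatchKey key = watcher.poll(10, TimeUnit.SECONDS);
      if (key != null) {
        for (WatchEvent<?> event : key.pollEvents()) {
          System.out.println(event.kind() + " " + event.context());
        }
        // reset() is mandatory: a key that is not reset stops queueing events.
        if (!key.reset()) {
          System.out.println("directory no longer accessible");
        }
      }
    }
  }
}
```

The try-with-resources block also addresses the concurrency concern: the WatchService is owned and closed by one thread instead of being reassigned across threads.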
[jira] [Created] (DRILL-7358) Text reader returns nothing for count queries over empty files
Arina Ielchiieva created DRILL-7358: --- Summary: Text reader returns nothing for count queries over empty files Key: DRILL-7358 URL: https://issues.apache.org/jira/browse/DRILL-7358 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Paul Rogers Fix For: 1.17.0 If we run count over empty CSV files (with or without headers), there is no result, though the expected result is 0. Unit test examples: {code} @Test public void testCount() throws Exception { String fileName = "headersOnly.csv"; try (PrintWriter out = new PrintWriter(new FileWriter(new File(testDir, fileName)))) { out.print("a,b,c"); // note: no \n at the end } queryBuilder().sql("SELECT count(1) FROM `dfs.data`.`" + fileName + "`").print(); } {code} {code} @Test public void testCount() throws Exception { String fileName = "empty.csv"; File file = new File(testDir, fileName); assertTrue(file.createNewFile()); queryBuilder().sql("SELECT count(1) FROM `dfs.data`.`" + fileName + "`").print(); } {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (DRILL-7357) Expose Drill Metastore Metadata through INFORMATION_SCHEMA
Arina Ielchiieva created DRILL-7357: --- Summary: Expose Drill Metastore Metadata through INFORMATION_SCHEMA Key: DRILL-7357 URL: https://issues.apache.org/jira/browse/DRILL-7357 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 Document: https://docs.google.com/document/d/10CkLdrlUJUNRrHKLeo8jTUJB8xAP1D0byTOvn8wNoF0/edit#heading=h.gzj2dj5a4yds Sections: 5.19 INFORMATION_SCHEMA updates 4.3.2 Using the statistics -- This message was sent by Atlassian Jira (v8.3.2#803003)
[ANNOUNCE] New PMC Chair of Apache Drill
Hi all, It has been an honor to serve as Drill Chair during the past year, but it's high time for a new one... I am very pleased to announce that the Drill PMC has voted to elect Charles Givre as the new PMC chair of Apache Drill. He has also been approved unanimously by the Apache Board in the last board meeting. Congratulations, Charles! Kind regards, Arina
[jira] [Created] (DRILL-7347) Upgrade Apache Iceberg to released version
Arina Ielchiieva created DRILL-7347: --- Summary: Upgrade Apache Iceberg to released version Key: DRILL-7347 URL: https://issues.apache.org/jira/browse/DRILL-7347 Project: Apache Drill Issue Type: Task Reporter: Arina Ielchiieva Currently Drill uses an Apache Iceberg build pinned to a certain commit via JitPack since there is no officially released version. Once the first Iceberg version is released, we need to switch to the officially released version instead of the commit. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (DRILL-6528) Planner setting the wrong number of records to read (Parquet Reader)
[ https://issues.apache.org/jira/browse/DRILL-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-6528. - Resolution: Fixed Resolved in the scope of DRILL-4517. > Planner setting the wrong number of records to read (Parquet Reader) > > > Key: DRILL-6528 > URL: https://issues.apache.org/jira/browse/DRILL-6528 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning Optimization >Reporter: salim achouche >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.17.0 > > > - Recently fixed the Flat Parquet reader to honor the number of records to > read > - Though few tests failed: > TestUnionDistinct.testUnionDistinctEmptySides:356 Different number of records > returned expected:<5> but was:<1> > TestUnionAll.testUnionAllEmptySides:355 Different number of records returned > expected:<5> but was:<1> > - I debugged one of them and realized the Planner was setting the wrong > number of rows to read (in this case, one) > - You can put a break point and see this happening: > Class: ParquetGroupScan > Method: updateRowGroupInfo(long maxRecords) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
Re: August Apache Drill board report
Thanks everybody for the feedback, made changes accordingly and submitted the report. Final report draft: ## Description: - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage. ## Issues: - There are no issues requiring board attention at this time. ## Activity: - Drill User Meetup was held on May 22, 2019. - Drill 1.17.0 release is planned in the end of August / beginning of September. ## Health report: - Development activity is almost 50% down due to acquisition of one of the main Drill vendors. - Activity on the dev and user mailing lists is slightly down compared to previous periods. - Four committers were added in the last period. ## PMC changes: - Currently 24 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Sorabh Hamirwasia on Fri Apr 05 2019 ## Committer base changes: - Currently 55 committers. - New committers: - Anton Gozhiy was added as a committer on Mon Jul 22 2019 - Bohdan Kazydub was added as a committer on Mon Jul 15 2019 - Igor Guzenko was added as a committer on Mon Jul 22 2019 - Venkata Jyothsna Donapati was added as a committer on Mon May 13 2019 ## Releases: - Last release was 1.16.0 on Thu May 02 2019 ## Mailing list activity: - dev@drill.apache.org: - 403 subscribers (down -5 in the last 3 months): - 1156 emails sent to list ( in previous quarter) - iss...@drill.apache.org: - 17 subscribers (up 0 in the last 3 months): - 1496 emails sent to list (2315 in previous quarter) - u...@drill.apache.org: - 575 subscribers (down -6 in the last 3 months): - 157 emails sent to list (230 in previous quarter) ## JIRA activity: - 96 JIRA tickets created in the last 3 months - 68 JIRA tickets closed/resolved in the last 3 months On Thu, Aug 8, 2019 at 9:38 PM Aman Sinha wrote: > Thanks for putting this together Arina. One minor comment is that for a > future release do we need to mention the feature set ? 
> Typically we would enumerate those in the next board report after the > release has happened. > > Aman > > On Thu, Aug 8, 2019 at 10:00 AM Sorabh Hamirwasia > wrote: > > > Hi Arina, > > Overall report looks good. One minor thing: > > - Drill User Meetup was be held on May 22, 2019. > > > > > > Thanks, > > > > Sorabh > > > > On Thu, Aug 8, 2019 at 7:05 AM Arina Ielchiieva > wrote: > > > > > Hi all, > > > > > > please take a look at the draft board report for the last quarter and > let > > > me know if you have any comments. > > > > > > Thanks, > > > Arina > > > > > > = > > > > > > ## Description: > > > - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud > > >Storage. > > > > > > ## Issues: > > > - There are no issues requiring board attention at this time. > > > > > > ## Activity: > > > - Drill User Meetup was be held on May 22, 2019. > > > - Drill 1.17.0 release is planned in the end of August / beginning > > > September, > > >it will include the following improvements: > > >- Drill Metastore implementation based on Iceberg tables and > > integration > > >- Hive arrays / structs support > > >- Canonical Map support > > >- Vararg UDFs support > > >- Run-time row group pruning > > >- Schema provisioning via table function > > >- Empty parquet files read / write support > > > > > > ## Health report: > > > - Development activity is almost 50% down due to acquisition one of > the > > > main Drill vendors. > > > - Activity on the dev and user mailing lists is slightly down compared > > to > > > previous periods. > > > - Four committers were added in the last period. > > > > > > ## PMC changes: > > > > > > - Currently 24 PMC members. > > > - No new PMC members added in the last 3 months > > > - Last PMC addition was Sorabh Hamirwasia on Fri Apr 05 2019 > > > > > > ## Committer base changes: > > > > > > - Currently 55 committers. 
> > > - New commmitters: > > > - Anton Gozhiy was added as a committer on Mon Jul 22 2019 > > > - Bohdan Kazydub was added as a committer on Mon Jul 15 2019 > > > - Igor Guzenko was added as a committer on Mon Jul 22 2019 > > > - Venkata Jyothsna Donapati was added as a committer on Mon May 13 > > 2019 > > > > > > ## Releases: > > > > > > - Last release was 1.16.0 on Thu May 02 2019 > > > > > > #
August Apache Drill board report
Hi all, please take a look at the draft board report for the last quarter and let me know if you have any comments. Thanks, Arina = ## Description: - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage. ## Issues: - There are no issues requiring board attention at this time. ## Activity: - Drill User Meetup was be held on May 22, 2019. - Drill 1.17.0 release is planned in the end of August / beginning September, it will include the following improvements: - Drill Metastore implementation based on Iceberg tables and integration - Hive arrays / structs support - Canonical Map support - Vararg UDFs support - Run-time row group pruning - Schema provisioning via table function - Empty parquet files read / write support ## Health report: - Development activity is almost 50% down due to acquisition one of the main Drill vendors. - Activity on the dev and user mailing lists is slightly down compared to previous periods. - Four committers were added in the last period. ## PMC changes: - Currently 24 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Sorabh Hamirwasia on Fri Apr 05 2019 ## Committer base changes: - Currently 55 committers. 
- New commmitters: - Anton Gozhiy was added as a committer on Mon Jul 22 2019 - Bohdan Kazydub was added as a committer on Mon Jul 15 2019 - Igor Guzenko was added as a committer on Mon Jul 22 2019 - Venkata Jyothsna Donapati was added as a committer on Mon May 13 2019 ## Releases: - Last release was 1.16.0 on Thu May 02 2019 ## Mailing list activity: - dev@drill.apache.org: - 403 subscribers (down -5 in the last 3 months): - 1156 emails sent to list ( in previous quarter) - iss...@drill.apache.org: - 17 subscribers (up 0 in the last 3 months): - 1496 emails sent to list (2315 in previous quarter) - u...@drill.apache.org: - 575 subscribers (down -6 in the last 3 months): - 157 emails sent to list (230 in previous quarter) ## JIRA activity: - 96 JIRA tickets created in the last 3 months - 68 JIRA tickets closed/resolved in the last 3 months
[jira] [Created] (DRILL-7339) Upgrade to Iceberg latest commits to fix issue with orphan files after delete in transaction
Arina Ielchiieva created DRILL-7339: --- Summary: Upgrade to Iceberg latest commits to fix issue with orphan files after delete in transaction Key: DRILL-7339 URL: https://issues.apache.org/jira/browse/DRILL-7339 Project: Apache Drill Issue Type: Task Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 The Drill Metastore executes many operations in transactions, including delete. Currently Iceberg creates orphan files when executing delete in a transaction; these files cannot be expired and keep piling up. Iceberg issue - https://github.com/apache/incubator-iceberg/issues/330. When #330 is fixed, we need to update the Iceberg commit to ensure this issue is resolved in Drill as well. PR with the fix - https://github.com/apache/incubator-iceberg/pull/352 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (DRILL-7335) Error when reading csv file with headers only
Arina Ielchiieva created DRILL-7335: --- Summary: Error when reading csv file with headers only Key: DRILL-7335 URL: https://issues.apache.org/jira/browse/DRILL-7335 Project: Apache Drill Issue Type: Improvement Components: Storage - Text CSV Affects Versions: 1.16.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 Prerequisites: file contains only a header line followed by \n: id,name. Error: {noformat} org.apache.drill.exec.rpc.RpcException: org.apache.drill.common.exceptions.UserRemoteException: EXECUTION_ERROR ERROR: File file:onlyHeaders.csv Fragment 0:0 {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (DRILL-7334) Update Iceberg Metastore Parquet write mode
Arina Ielchiieva created DRILL-7334: --- Summary: Update Iceberg Metastore Parquet write mode Key: DRILL-7334 URL: https://issues.apache.org/jira/browse/DRILL-7334 Project: Apache Drill Issue Type: Improvement Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 Initially the Iceberg Parquet writer used OVERWRITE mode by default. After f4fc8ff, the default mode is CREATE. We need to update the Iceberg Metastore code to comply with the latest changes. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
Re: [ANNOUNCE] New Committer: Anton Gozhyi
Congratulations Anton! Thanks for your contributions. Kind regards, Arina On Mon, Jul 29, 2019 at 12:55 PM Павел Семенов wrote: > Congratulations Anton ! Well done. > > пн, 29 июл. 2019 г. в 12:54, Bohdan Kazydub : > > > Congratulations Anton! > > > > On Mon, Jul 29, 2019 at 12:44 PM Igor Guzenko < > ihor.huzenko@gmail.com> > > wrote: > > > > > Congratulations Anton! > > > > > > On Mon, Jul 29, 2019 at 12:09 PM denysord88 > > wrote: > > > > > > > Congratulations Anton! Well deserved! > > > > > > > > On 07/29/2019 12:02 PM, Volodymyr Vysotskyi wrote: > > > > > The Project Management Committee (PMC) for Apache Drill has invited > > > Anton > > > > > Gozhyi to become a committer, and we are pleased to announce that > he > > > has > > > > > accepted. > > > > > > > > > > Anton Gozhyi has been contributing to Drill for more than a year > and > > a > > > > > half. He did significant contributions as a QA, including reporting > > > > > non-trivial issues and working on automation of Drill tests. All > the > > > > issues > > > > > reported by Anton have a clear description of the problem, steps to > > > > > reproduce and expected behavior. Besides contributions as a QA, > Anton > > > > made > > > > > high-quality fixes into Drill. > > > > > > > > > > Welcome Anton, and thank you for your contributions! > > > > > > > > > > - Volodymyr > > > > > (on behalf of Drill PMC) > > > > > > > > > > > > > > > > > > > > > -- > > *Kind regards,* > *Pavel Semenov* >
[jira] [Created] (DRILL-7331) Support Iceberg
Arina Ielchiieva created DRILL-7331: --- Summary: Support Iceberg Key: DRILL-7331 URL: https://issues.apache.org/jira/browse/DRILL-7331 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.17.0 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (DRILL-7330) Implement metadata usage for text format plugin
Arina Ielchiieva created DRILL-7330: --- Summary: Implement metadata usage for text format plugin Key: DRILL-7330 URL: https://issues.apache.org/jira/browse/DRILL-7330 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva Assignee: Volodymyr Vysotskyi -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (DRILL-7329) Implement metadata usage for parquet format plugin
Arina Ielchiieva created DRILL-7329: --- Summary: Implement metadata usage for parquet format plugin Key: DRILL-7329 URL: https://issues.apache.org/jira/browse/DRILL-7329 Project: Apache Drill Issue Type: Sub-task Reporter: Arina Ielchiieva Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 1. Implement retrieval steps (including auto-refresh policy, retry attempts, fallback to file metadata). 2. Change the current group scan to leverage Schema from the Metastore. 3. Verify that metadata is used correctly by metastore classes. Add options: planner.metadata.use_schema planner.metadata.use_statistics metadata.ctas.auto_collect metadata.fallback_to_file_metadata metastore.auto-refresh -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[ANNOUNCE] New Committer: Igor Guzenko
The Project Management Committee (PMC) for Apache Drill has invited Igor Guzenko to become a committer, and we are pleased to announce that he has accepted. Igor has been contributing to Drill for 9 months and has made a number of significant contributions, including cross join syntax support, Hive views support, as well as improving performance for Hive show schema and unit tests. Currently he is working on supporting Hive complex types [DRILL-3290]. He has already added support for the list type and is working on struct and canonical map. Welcome Igor, and thank you for your contributions! - Arina (on behalf of the Apache Drill PMC)
[ANNOUNCE] New Committer: Bohdan Kazydub
The Project Management Committee (PMC) for Apache Drill has invited Bohdan Kazydub to become a committer, and we are pleased to announce that he has accepted. Bohdan has been contributing to Drill for more than a year. His contributions include logging and various function-handling improvements, planning optimizations and S3 improvements / fixes. His recent work includes the Calcite 1.19 / 1.20 upgrade [DRILL-7200] and the implementation of canonical Map [DRILL-7096]. Welcome Bohdan, and thank you for your contributions! - Arina (on behalf of the Apache Drill PMC)