[jira] [Commented] (DRILL-4120) dir0 does not work when the directory structure contains Avro files

2015-11-21 Thread Bhallamudi Venkata Siva Kamesh (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020820#comment-15020820
 ] 

Bhallamudi Venkata Siva Kamesh commented on DRILL-4120:
---

Hi Jacques,
 I will look into this.

> dir0 does not work when the directory structure contains Avro files
> ---
>
> Key: DRILL-4120
> URL: https://issues.apache.org/jira/browse/DRILL-4120
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.3.0
>Reporter: Stefán Baxter
>Assignee: Bhallamudi Venkata Siva Kamesh
> Fix For: 1.4.0
>
>
> Any select statement containing dirN fails if the target directory structure
> contains Avro files.
> Steps to test:
> 1. create a simple directory structure
> 2. copy an avro file into each directory
> 3. execute a query containing dir0
> outcome:
> Error: VALIDATION ERROR: From line 1, column 117 to line 1, column 120: 
> Column 'dir0' not found in any table



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4120) dir0 does not work when the directory structure contains Avro files

2015-11-21 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-4120:
--
Assignee: Bhallamudi Venkata Siva Kamesh

> dir0 does not work when the directory structure contains Avro files
> ---
>
> Key: DRILL-4120
> URL: https://issues.apache.org/jira/browse/DRILL-4120
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.3.0
>Reporter: Stefán Baxter
>Assignee: Bhallamudi Venkata Siva Kamesh
> Fix For: 1.4.0
>
>
> Any select statement containing dirN fails if the target directory structure
> contains Avro files.
> Steps to test:
> 1. create a simple directory structure
> 2. copy an avro file into each directory
> 3. execute a query containing dir0
> outcome:
> Error: VALIDATION ERROR: From line 1, column 117 to line 1, column 120: 
> Column 'dir0' not found in any table





[jira] [Commented] (DRILL-4120) dir0 does not work when the directory structure contains Avro files

2015-11-21 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020712#comment-15020712
 ] 

Jacques Nadeau commented on DRILL-4120:
---

Looks like this is a regression due to 
https://issues.apache.org/jira/browse/DRILL-3810

We added schema validation for Avro. We should also validate against dir 
columns.

[~kam_iitkgp], can you take a look?
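
To illustrate the idea of validating against dir columns as well, here is a minimal sketch in which a column passes validation if it is either a field of the file schema or one of the implicit dirN partition columns. The class `ColumnValidator`, the method `isKnownColumn`, and the sample field names are hypothetical and are not taken from Drill's code base.

```java
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical validator, for illustration only; not Drill's actual implementation. */
final class ColumnValidator {
    // Implicit partition columns are assumed to be named dir0, dir1, ...
    private static final Pattern DIR_COLUMN =
            Pattern.compile("dir\\d+", Pattern.CASE_INSENSITIVE);

    private ColumnValidator() {}

    /** Accepts a column if the schema declares it or if it looks like dirN. */
    static boolean isKnownColumn(String column, Set<String> schemaFields) {
        return schemaFields.contains(column) || DIR_COLUMN.matcher(column).matches();
    }

    public static void main(String[] args) {
        Set<String> avroFields = Set.of("id", "name"); // made-up schema fields
        System.out.println(isKnownColumn("dir0", avroFields));
        System.out.println(isKnownColumn("missing", avroFields));
    }
}
```

A validator that checks only the first condition would reject dir0, which matches the symptom reported in this issue.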

> dir0 does not work when the directory structure contains Avro files
> ---
>
> Key: DRILL-4120
> URL: https://issues.apache.org/jira/browse/DRILL-4120
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.3.0
>Reporter: Stefán Baxter
>Assignee: Bhallamudi Venkata Siva Kamesh
> Fix For: 1.4.0
>
>
> Any select statement containing dirN fails if the target directory structure
> contains Avro files.
> Steps to test:
> 1. create a simple directory structure
> 2. copy an avro file into each directory
> 3. execute a query containing dir0
> outcome:
> Error: VALIDATION ERROR: From line 1, column 117 to line 1, column 120: 
> Column 'dir0' not found in any table





[jira] [Updated] (DRILL-3810) Filesystem plugin's support for file format's schema

2015-11-21 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-3810:
--
Assignee: Bhallamudi Venkata Siva Kamesh

> Filesystem plugin's support for file format's schema
> 
>
> Key: DRILL-3810
> URL: https://issues.apache.org/jira/browse/DRILL-3810
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON, Storage - Other, Storage - Parquet, 
> Storage - Text & CSV
>Reporter: Bhallamudi Venkata Siva Kamesh
>Assignee: Bhallamudi Venkata Siva Kamesh
> Fix For: 1.3.0
>
>
> The Filesystem plugin supports multiple file formats, such as
>   * json
>   * avro
>   * text (csv|psv|tsv)
>   * parquet
> and can support any type of file format.
> Some of these file formats are schema-based, like *avro* and *parquet*,
> and some are schema-less, like *json*.
> For schema-based file formats, Drill should be able to validate the
> query against the file schema before it starts executing the query.





[jira] [Updated] (DRILL-4120) dir0 does not work when the directory structure contains Avro files

2015-11-21 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-4120:
--
Fix Version/s: (was: 1.3.0)
   1.4.0

> dir0 does not work when the directory structure contains Avro files
> ---
>
> Key: DRILL-4120
> URL: https://issues.apache.org/jira/browse/DRILL-4120
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.3.0
>Reporter: Stefán Baxter
> Fix For: 1.4.0
>
>
> Any select statement containing dirN fails if the target directory structure
> contains Avro files.
> Steps to test:
> 1. create a simple directory structure
> 2. copy an avro file into each directory
> 3. execute a query containing dir0
> outcome:
> Error: VALIDATION ERROR: From line 1, column 117 to line 1, column 120: 
> Column 'dir0' not found in any table





[jira] [Created] (DRILL-4120) dir0 does not work when the directory structure contains Avro files

2015-11-21 Thread JIRA
Stefán Baxter created DRILL-4120:


 Summary: dir0 does not work when the directory structure contains 
Avro files
 Key: DRILL-4120
 URL: https://issues.apache.org/jira/browse/DRILL-4120
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.3.0
Reporter: Stefán Baxter
 Fix For: 1.3.0


Any select statement containing dirN fails if the target directory structure
contains Avro files.

Steps to test:
1. create a simple directory structure
2. copy an avro file into each directory
3. execute a query containing dir0

outcome:
Error: VALIDATION ERROR: From line 1, column 117 to line 1, column 120: Column 
'dir0' not found in any table
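
The steps to test can be scripted. The following is a minimal sketch that builds the directory layout and prints a query that references dir0. The class `Dir0Repro`, the subdirectory names, and the query text are hypothetical; the report does not include the original query, and the Avro file must be supplied by the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Reproduction helper for illustration; all names here are hypothetical. */
final class Dir0Repro {
    private Dir0Repro() {}

    /** Steps 1 and 2: create root/<sub> and copy the Avro file into each one. */
    static void buildLayout(Path root, Path avroFile, String... subDirs) throws IOException {
        for (String sub : subDirs) {
            Path dir = Files.createDirectories(root.resolve(sub));
            Files.copy(avroFile, dir.resolve(avroFile.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /** Step 3: an example query that references the implicit dir0 column. */
    static String dir0Query(Path root) {
        return "SELECT dir0, COUNT(*) FROM dfs.`" + root + "` GROUP BY dir0";
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: Dir0Repro <target-dir> <avro-file>");
            return;
        }
        Path root = Paths.get(args[0]);
        buildLayout(root, Paths.get(args[1]), "2014", "2015");
        System.out.println(dir0Query(root));
    }
}
```

The printed query can then be run from the Drill shell against the generated directory.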





[jira] [Commented] (DRILL-4119) Skew in hash distribution for varchar (and possibly other) types of data

2015-11-21 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020672#comment-15020672
 ] 

Jacques Nadeau commented on DRILL-4119:
---

Sounds good.

> Skew in hash distribution for varchar (and possibly other) types of data
> 
>
> Key: DRILL-4119
> URL: https://issues.apache.org/jira/browse/DRILL-4119
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.3.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>
> We are seeing substantial skew for an Id column that contains varchar data of 
> length 32.   It is easily reproducible by a group-by query: 
> {noformat}
> Explain plan for SELECT SomeId From table GROUP BY SomeId;
> ...
> 01-02  HashAgg(group=[{0}])
> 01-03    Project(SomeId=[$0])
> 01-04      HashToRandomExchange(dist0=[[$0]])
> 02-01        UnorderedMuxExchange
> 03-01          Project(SomeId=[$0], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
> 03-02            HashAgg(group=[{0}])
> 03-03              Project(SomeId=[$0])
> {noformat}
> The string id happens to be of the following type: 
> {noformat}
> e4b4388e8865819126cb0e4dcaa7261d
> {noformat}





[jira] [Commented] (DRILL-4119) Skew in hash distribution for varchar (and possibly other) types of data

2015-11-21 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020665#comment-15020665
 ] 

Aman Sinha commented on DRILL-4119:
---

Yes, it would be useful to have a suite for the hashing.  The number of 
combinations is large:  num_data_types x nullability x num_hash_function_types 
(32bit, 64bit, AsDouble variations).  Plus, the nature of the data itself - we 
need real world data for testing the quality of the distribution.  Let me see 
if I can at least have a minimal test suite with some sample of the above 
combinations.   I may end up creating a separate JIRA.
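
As a rough illustration of what one case in such a suite could look like, the sketch below generates random 32-character hex ids, hashes them into buckets, and reports how far the fullest bucket is from a perfectly even spread. The hash used here is a stand-in (String.hashCode() followed by a common integer mixing step), not Drill's XXHash, and the class and method names are hypothetical.

```java
import java.util.Random;

/** Hypothetical distribution check; the hash is a stand-in, not Drill's. */
final class HashDistributionCheck {
    private HashDistributionCheck() {}

    /** Stand-in 32-bit hash: String.hashCode() passed through an integer mixer. */
    static int hash32(String s) {
        int h = s.hashCode();
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Random id of 32 lowercase hex characters, similar in shape to the reported ids. */
    static String randomHexId(Random rnd) {
        StringBuilder sb = new StringBuilder(32);
        for (int i = 0; i < 32; i++) {
            sb.append(Character.forDigit(rnd.nextInt(16), 16));
        }
        return sb.toString();
    }

    /** Fullest bucket divided by the ideal per-bucket count; 1.0 means perfectly even. */
    static double maxLoadRatio(int numIds, int numBuckets, long seed) {
        Random rnd = new Random(seed);
        int[] counts = new int[numBuckets];
        for (int i = 0; i < numIds; i++) {
            counts[Math.floorMod(hash32(randomHexId(rnd)), numBuckets)]++;
        }
        int max = 0;
        for (int c : counts) {
            max = Math.max(max, c);
        }
        return max / ((double) numIds / numBuckets);
    }

    public static void main(String[] args) {
        System.out.println(maxLoadRatio(100_000, 16, 42L));
    }
}
```

A real suite would swap in each hash function under test and, as noted above, real-world data rather than random ids; a ratio well above 1.0 would indicate skew.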

> Skew in hash distribution for varchar (and possibly other) types of data
> 
>
> Key: DRILL-4119
> URL: https://issues.apache.org/jira/browse/DRILL-4119
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.3.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>
> We are seeing substantial skew for an Id column that contains varchar data of 
> length 32.   It is easily reproducible by a group-by query: 
> {noformat}
> Explain plan for SELECT SomeId From table GROUP BY SomeId;
> ...
> 01-02  HashAgg(group=[{0}])
> 01-03    Project(SomeId=[$0])
> 01-04      HashToRandomExchange(dist0=[[$0]])
> 02-01        UnorderedMuxExchange
> 03-01          Project(SomeId=[$0], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
> 03-02            HashAgg(group=[{0}])
> 03-03              Project(SomeId=[$0])
> {noformat}
> The string id happens to be of the following type: 
> {noformat}
> e4b4388e8865819126cb0e4dcaa7261d
> {noformat}





[jira] [Commented] (DRILL-4119) Skew in hash distribution for varchar (and possibly other) types of data

2015-11-21 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020638#comment-15020638
 ] 

Jacques Nadeau commented on DRILL-4119:
---

Interesting finding. As we've been stung by issues around hash functions 
before, it seems like we need to have a hash distribution test suite, 
especially when we make these kinds of changes. Each time we have an issue, 
then we can add that to the suite. I know one of the issues we had before was 
hashing null with another value (which we fixed with chaining). I can't 
remember what other issues we've had.

Your proposal seems reasonable.

> Skew in hash distribution for varchar (and possibly other) types of data
> 
>
> Key: DRILL-4119
> URL: https://issues.apache.org/jira/browse/DRILL-4119
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.3.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>
> We are seeing substantial skew for an Id column that contains varchar data of 
> length 32.   It is easily reproducible by a group-by query: 
> {noformat}
> Explain plan for SELECT SomeId From table GROUP BY SomeId;
> ...
> 01-02  HashAgg(group=[{0}])
> 01-03    Project(SomeId=[$0])
> 01-04      HashToRandomExchange(dist0=[[$0]])
> 02-01        UnorderedMuxExchange
> 03-01          Project(SomeId=[$0], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
> 03-02            HashAgg(group=[{0}])
> 03-03              Project(SomeId=[$0])
> {noformat}
> The string id happens to be of the following type: 
> {noformat}
> e4b4388e8865819126cb0e4dcaa7261d
> {noformat}





[jira] [Commented] (DRILL-4119) Skew in hash distribution for varchar (and possibly other) types of data

2015-11-21 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020627#comment-15020627
 ] 

Aman Sinha commented on DRILL-4119:
---

The problem comes from the cast to integer after computing the hash value:  
castInt(hash64AsDouble($0)).   I verified that the hash64AsDouble produces good 
distribution for the hash value but the cast loses the precision.   The 
hash-based operators all use a 32 bit hash value (for smaller memory footprint 
and related reasons), so we do need the integer value but should preserve as 
much as possible the underlying distribution. 

I am fixing this so that, instead of casting to int, the underlying
hash function itself computes a 32-bit hash value: it first computes the 64-bit
hash and then XORs the most significant 4 bytes with the least significant
4 bytes. The hash32 functions in XXHash.java (for example, see
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/XXHash.java#L198)
currently call hash64 and then cast to int. I am proposing to
change these to use the above mechanism of combining the most significant and
least significant bytes. The CPU cost should be relatively small.
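
The folding step described above can be sketched in a few lines. This is an illustration of the technique only, not the actual patch; the class `HashFold` and its method names are hypothetical.

```java
/** Illustration of folding a 64-bit hash into 32 bits; names are hypothetical. */
final class HashFold {
    private HashFold() {}

    /** XORs the most significant 4 bytes with the least significant 4 bytes. */
    static int fold64To32(long hash64) {
        return (int) (hash64 ^ (hash64 >>> 32));
    }

    /** Plain narrowing cast, shown for contrast: keeps only the low 32 bits. */
    static int truncate(long hash64) {
        return (int) hash64;
    }

    public static void main(String[] args) {
        long h = 0x1234567800000000L; // all variation sits in the high half
        System.out.println(Integer.toHexString(truncate(h)));   // prints 0
        System.out.println(Integer.toHexString(fold64To32(h))); // prints 12345678
    }
}
```

With the fold, bits from both halves of the 64-bit value contribute to the result, at the cost of one shift and one XOR. The same expression is what the JDK's Long.hashCode(long) computes.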

> Skew in hash distribution for varchar (and possibly other) types of data
> 
>
> Key: DRILL-4119
> URL: https://issues.apache.org/jira/browse/DRILL-4119
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.3.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>
> We are seeing substantial skew for an Id column that contains varchar data of 
> length 32.   It is easily reproducible by a group-by query: 
> {noformat}
> Explain plan for SELECT SomeId From table GROUP BY SomeId;
> ...
> 01-02  HashAgg(group=[{0}])
> 01-03    Project(SomeId=[$0])
> 01-04      HashToRandomExchange(dist0=[[$0]])
> 02-01        UnorderedMuxExchange
> 03-01          Project(SomeId=[$0], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
> 03-02            HashAgg(group=[{0}])
> 03-03              Project(SomeId=[$0])
> {noformat}
> The string id happens to be of the following type: 
> {noformat}
> e4b4388e8865819126cb0e4dcaa7261d
> {noformat}





[jira] [Created] (DRILL-4119) Skew in hash distribution for varchar (and possibly other) types of data

2015-11-21 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-4119:
-

 Summary: Skew in hash distribution for varchar (and possibly 
other) types of data
 Key: DRILL-4119
 URL: https://issues.apache.org/jira/browse/DRILL-4119
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.3.0
Reporter: Aman Sinha
Assignee: Aman Sinha


We are seeing substantial skew for an Id column that contains varchar data of 
length 32.   It is easily reproducible by a group-by query: 
{noformat}
Explain plan for SELECT SomeId From table GROUP BY SomeId;
...
01-02  HashAgg(group=[{0}])
01-03    Project(SomeId=[$0])
01-04      HashToRandomExchange(dist0=[[$0]])
02-01        UnorderedMuxExchange
03-01          Project(SomeId=[$0], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
03-02            HashAgg(group=[{0}])
03-03              Project(SomeId=[$0])
{noformat}

The string id happens to be of the following type: 
{noformat}
e4b4388e8865819126cb0e4dcaa7261d
{noformat}






[jira] [Resolved] (DRILL-4070) Files written with versions of Drill before v1.3 record metadata that is indistinguishable from bad metadata from other Parquet creators

2015-11-21 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau resolved DRILL-4070.
---
Resolution: Won't Fix

Workaround provided in comments by Parth.

> Files written with versions of Drill before v1.3 record metadata that is 
> indistinguishable from bad metadata from other Parquet creators
> 
>
> Key: DRILL-4070
> URL: https://issues.apache.org/jira/browse/DRILL-4070
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Parth Chandra
>Priority: Blocker
> Fix For: 1.3.0
>
> Attachments: cache.txt, fewtypes_varcharpartition.tar.tgz
>
>
> Drill uses the parquet-mr library to write Parquet files. The metadata
> signature that Drill produced in 1.2 and earlier versions is
> indistinguishable from older footers written by other tools (such as Pig and
> Hive). A known bug in how those tools wrote metadata caused the
> statistics to be incorrect. To correct this, the parquet-mr library adopted a
> behavior of ignoring statistics from the old form of the Parquet footer.
> With 1.3, Drill upgraded to the latest version of parquet-mr and has now
> started ignoring these statistics as well. This ensures correct results but
> produces performance regressions (compared to Drill 1.1 and 1.2) when querying
> partitioned Parquet files generated in Drill 1.1 and 1.2.





[jira] [Closed] (DRILL-4087) Error parsing JSON - Invalid numeric value: Leading zeroes not allowed

2015-11-21 Thread Julian Hyde (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde closed DRILL-4087.
--
Resolution: Invalid

Changed the resolution from Fixed to Invalid; there was never a problem with
Drill.

> Error parsing JSON - Invalid numeric value: Leading zeroes not allowed
> --
>
> Key: DRILL-4087
> URL: https://issues.apache.org/jira/browse/DRILL-4087
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.2.0
> Environment: Hadoop 2.7.1 cluster running on AWS staging instance 
> t4.medium 
> Apache Drill - 1.2.0
>Reporter: Shankar
>
> jdbc:drill:> SELECT count(`timestamp`) FROM dfs.`/tmp/drill-s/` limit 10;
> Error: DATA_READ ERROR: Error parsing JSON - Invalid numeric value: Leading 
> zeroes not allowed
> Is there any solution for this error?





[jira] [Reopened] (DRILL-4087) Error parsing JSON - Invalid numeric value: Leading zeroes not allowed

2015-11-21 Thread Julian Hyde (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde reopened DRILL-4087:


> Error parsing JSON - Invalid numeric value: Leading zeroes not allowed
> --
>
> Key: DRILL-4087
> URL: https://issues.apache.org/jira/browse/DRILL-4087
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.2.0
> Environment: Hadoop 2.7.1 cluster running on AWS staging instance 
> t4.medium 
> Apache Drill - 1.2.0
>Reporter: Shankar
>
> jdbc:drill:> SELECT count(`timestamp`) FROM dfs.`/tmp/drill-s/` limit 10;
> Error: DATA_READ ERROR: Error parsing JSON - Invalid numeric value: Leading 
> zeroes not allowed
> Is there any solution for this error?





[jira] [Created] (DRILL-4118) Drillbit Foreman shutdowns while executing complex query on large amount of data

2015-11-21 Thread Shankar (JIRA)
Shankar created DRILL-4118:
--

 Summary: Drillbit Foreman shutdowns while executing complex query 
on large amount of data
 Key: DRILL-4118
 URL: https://issues.apache.org/jira/browse/DRILL-4118
 Project: Apache Drill
  Issue Type: Test
Affects Versions: 1.2.0
Reporter: Shankar


h4.{color:DarkCyan}*System config for POC:*{color}
* Servers => AWS instances
* Total Servers => 3
* instance Type => c4.xlarge
* vCPU => 4
* Memory => 7.5 GB
* Storage Type => EBS
* OS => CentOS-6.6 ( x64 architecture)

h4.{color:DarkCyan}*Data :*{color}
* Data size => 15 GB gzip-compressed (equivalent to 150 GB of uncompressed data)
* Type of data => JSON format (1 JSON object per line)
* Persistent storage => HDFS
* Data frequency => 1 day of data only (files are split by hour)

h4.{color:DarkCyan}*How we set up Apache Drill :*{color}
# Version = Apache Drill 1.2.0
# Set up using default configurations on all 3 nodes.
# Used the Drill shell to run queries.
# Used the Drill Web Console to analyze the queries.


h4.{color:green}*Query-1 (total counts):*{color}
We ran a simple query on *1 hour of data*. Below is the query:

 - select count(`timestamp`) from dfs.`/tmp/hadoop/20151120-10.json.gz`

- The query took around 120 seconds and ran successfully.
- cpu load => 1.5 (average per node)
- memory used => 3gb (average per node)



h4.{color:green}*Query-2 (distinct counts)  :*{color}
We ran a simple query on *1 hour of data*. Below is the query:

- select count( distinct `timestamp`) from dfs.`/tmp/hadoop/20151120-10.json.gz`

- The query took around 200 seconds and ran successfully.
- cpu load => 5.5 (average per node)
- memory used => 3.9gb (average per node)



h4.{color:green}*Query-3 (create table using filter)  :*{color}

We ran a simple query on *1 day of data*. Below is the query:

- create table tmp as select col1, col2 from dfs.`/tmp/hadoop`
where col like '%filter-text%'

- All columns are strings.
- The query took around 340 seconds and ran successfully.
- cpu load => 6.2 (average per node)
- memory used => 4.2gb (average per node)



h4.{color:red}*Query-4 (complex query with filters) :*{color}
We ran a query on *1 day of data*. Below is the query:

select
count( distinct case when col like '%filter-text%' then sessions end ) as 
new_col_01,
count( distinct case when col like '%filter-text%' then sessions end ) as 
new_col_02,
--
--
--
count( distinct case when col like '%filter-text%' then sessions end ) as 
new_col_15
from dfs.`/tmp/hadoop`

-- All columns are strings.
-- Filter conditions are different for each count clause.
-- {color:red}From the Drill shell => *the query appeared to still be running*{color}
-- {color:red}From the logs => *Drillbit Foreman shut down*{color}
- cpu load => *85.x* (average per node)
- memory used => *6.6gb* (average per node)

{color:red}=> Error from Log file of drillbit Foreman node{color}


2015-11-20 18:53:59,185 [29b058ba-2c2c-2c7b-d380-00fb51af47c2:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Time: 41ms total, 41.774180ms avg, 41ms max.
2015-11-20 18:53:59,185 [29b058ba-2c2c-2c7b-d380-00fb51af47c2:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Earliest start: 7.217000 μs, Latest start: 7.217000 μs, Average 
start: 7.217000 μs .
2015-11-20 19:06:07,320 [Drillbit-ShutdownHook#0] INFO  
o.apache.drill.exec.server.Drillbit - Received shutdown request.






h4.*Questions are:*

# Could you please suggest a solution for the above error?
# Does Drill need high-end servers to process large amounts of data?
# Does Drill work well if we scale our servers horizontally with low
system configurations (say 4 virtual CPUs, 8 GB memory) and process large
amounts of data?
# Does Drill work well if we scale our servers horizontally with low
system configurations (say 8 virtual CPUs, 16 GB memory) and process large
amounts of data?
# Finally, please suggest a well-tuned configuration.









[jira] [Resolved] (DRILL-4087) Error parsing JSON - Invalid numeric value: Leading zeroes not allowed

2015-11-21 Thread Shankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shankar resolved DRILL-4087.

Resolution: Fixed

> Error parsing JSON - Invalid numeric value: Leading zeroes not allowed
> --
>
> Key: DRILL-4087
> URL: https://issues.apache.org/jira/browse/DRILL-4087
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.2.0
> Environment: Hadoop 2.7.1 cluster running on AWS staging instance 
> t4.medium 
> Apache Drill - 1.2.0
>Reporter: Shankar
>
> jdbc:drill:> SELECT count(`timestamp`) FROM dfs.`/tmp/drill-s/` limit 10;
> Error: DATA_READ ERROR: Error parsing JSON - Invalid numeric value: Leading 
> zeroes not allowed
> Is there any solution for this error?





[jira] [Commented] (DRILL-4087) Error parsing JSON - Invalid numeric value: Leading zeroes not allowed

2015-11-21 Thread Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020459#comment-15020459
 ] 

Shankar commented on DRILL-4087:


Thanks. This has solved my issue.

> Error parsing JSON - Invalid numeric value: Leading zeroes not allowed
> --
>
> Key: DRILL-4087
> URL: https://issues.apache.org/jira/browse/DRILL-4087
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.2.0
> Environment: Hadoop 2.7.1 cluster running on AWS staging instance 
> t4.medium 
> Apache Drill - 1.2.0
>Reporter: Shankar
>
> jdbc:drill:> SELECT count(`timestamp`) FROM dfs.`/tmp/drill-s/` limit 10;
> Error: DATA_READ ERROR: Error parsing JSON - Invalid numeric value: Leading 
> zeroes not allowed
> Is there any solution for this error?


