Re: ZK lost connectivity issue on large cluster

2016-10-17 Thread Padma Penumarthy
Hi Francois,

It would be good to understand how increasing affinity_factor helped in your 
case, so we can document it better and also use that knowledge to improve 
things in a future release.

Since you have two clusters, it is not clear whether you had the problem on 
the 12 node cluster, the 220 node cluster, or both. Is the dataset the same on 
both? Is max_width_per_node=8 on both clusters?

Increasing the affinity factor will lower remote reads by scheduling more 
fragments (doing more work) on nodes which have the data available locally. 
So there seems to be some kind of non-uniform data distribution for sure. It 
would be good if you can provide more details, i.e. how the data is 
distributed in the cluster and how the load on the nodes changed when the 
affinity factor was increased.
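[Editor's note] The scheduling effect described above can be illustrated with a toy model. This is not Drill's actual parallelizer code; the function and parameter names are made up for illustration only:

```python
def assign_scan_fragments(nodes, local_fraction, total_fragments, affinity_factor):
    """Toy model of locality-weighted scheduling: each node's weight grows
    with the fraction of the scanned data it holds locally, scaled by the
    affinity factor, so a large factor starves nodes that would otherwise
    have to read the data remotely."""
    weights = {n: 1.0 + affinity_factor * local_fraction.get(n, 0.0) for n in nodes}
    total = sum(weights.values())
    return {n: round(total_fragments * w / total) for n, w in weights.items()}

# Default-ish factor: the remote node still receives scan fragments,
# which means remote reads over the network.
low = assign_scan_fragments(["data-node", "remote-node"], {"data-node": 1.0}, 10, 1.2)

# Large factor: virtually all fragments stay where the data is.
high = assign_scan_fragments(["data-node", "remote-node"], {"data-node": 1.0}, 10, 100)
```

Under this model, raising the factor from 1.2 to 100 shifts essentially every scan fragment onto the data-local node, which matches the reported drop in remote reads.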

Thanks,
Padma


> On Oct 14, 2016, at 6:45 PM, François Méthot  wrote:
> 
> We have a 12 node cluster and a 220 node cluster, but they do not talk
> to each other, so Padma's analysis does not apply; thanks for your
> comments. Our goal had been to run Drill on the 220 node cluster after it
> proved worthy of it on the small cluster.
> 
> planner.width.max_per_node was eventually reduced to 2 while we were trying
> to figure this out; it would still fail. After we figured out
> affinity_factor, we put it back to its original value and it worked
> fine.
> 
> 
> 
> Sudheesh: Indeed, the ZK/Drill services use the same network on our bigger
> cluster.
> 
> Potential improvements:
> - planner.affinity_factor should be better documented.
> - When ZK disconnected, the running queries systematically failed. When we
> disabled the ForemanException thrown in the
> QueryManager.drillbitUnregistered method, most of our queries started to
> run successfully; we would sometimes get a Drillbit Disconnected error
> within the RPC work bus. It confirmed that we still had something going on
> on our network, but it also showed that the RPC bus between drillbits was
> more resilient to network hiccups. I could not prove it, but I think under
> certain conditions the ZK session gets recreated, which causes a
> QueryManager unregister (the query fails) and a register call right after,
> while the RPC bus remains connected.
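[Editor's note] If that hypothesis is right, one conceivable mitigation is to debounce the unregister event rather than fail queries immediately. The sketch below is purely illustrative with made-up names; it is not Drill's actual QueryManager behavior:

```python
import threading
import time

class DrillbitWatcher:
    """Hypothetical debounce: instead of failing running queries the moment
    ZooKeeper reports a drillbit unregistered, start a grace timer; if the
    same drillbit re-registers before the timer fires (e.g. the ZK session
    was merely recreated), the pending failure is cancelled."""

    def __init__(self, fail_query, grace_seconds=30.0):
        self.fail_query = fail_query   # callback invoked on a real loss
        self.grace_seconds = grace_seconds
        self.pending = {}              # drillbit -> pending grace timer

    def on_unregistered(self, drillbit):
        timer = threading.Timer(self.grace_seconds, self._expire, args=(drillbit,))
        self.pending[drillbit] = timer
        timer.start()

    def _expire(self, drillbit):
        # Grace period elapsed with no re-register: treat as a real loss.
        if self.pending.pop(drillbit, None) is not None:
            self.fail_query(drillbit)

    def on_registered(self, drillbit):
        timer = self.pending.pop(drillbit, None)
        if timer is not None:
            timer.cancel()             # session bounce: keep queries running

# Demo: node1 re-registers within the grace period, node2 never does.
failed = []
watcher = DrillbitWatcher(failed.append, grace_seconds=0.2)
watcher.on_unregistered("node1")
watcher.on_registered("node1")
watcher.on_unregistered("node2")
time.sleep(0.5)
```

In this toy scenario only the drillbit that never re-registers triggers a query failure; a transient ZK session bounce is absorbed by the grace period.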
> 
> 
> We really appreciate your feedback and we hope to contribute to this great
> project in the future.
> Thanks
> Francois
> 
> 
> 
> 
> 
> 
> On Fri, Oct 14, 2016 at 3:00 PM, Padma Penumarthy  wrote:
> 
>> 
>> Seems like you have 215 nodes, but the data for your query is there on
>> only 12 nodes.
>> Drill tries to distribute the scan fragments across the cluster more
>> uniformly (trying to utilize all CPU resources).
>> That is why you have a lot of remote reads going on; increasing the
>> affinity factor eliminates running scan fragments on the other (215-12)
>> nodes.
>> 
>> You also mentioned planner.width.max_per_node is set to 8.
>> So, with the increased affinity factor, you have 8 scan fragments doing a
>> lot more work on these 12 nodes.
>> Still, you got a 10X improvement. Seems like your network is the obvious
>> bottleneck. Is it 10G or 1G?
>> 
>> Also, increasing the affinity factor helped in your case because there is
>> no data on the other nodes.
>> But if you have data non-uniformly distributed across more nodes, you
>> might still have the problem.
>> 
>> Thanks,
>> Padma
>> 
>>> On Oct 14, 2016, at 11:18 AM, Sudheesh Katkam  wrote:
>>> 
>>> Hi Francois,
>>> 
>>> Thank you for posting your findings! Glad to see a 10X improvement.
>>> 
>>> By increasing the affinity factor, it looks like Drill's parallelizer is
>>> forced to assign fragments on nodes with data, i.e. with high
>>> favorability for data locality.
>>> 
>>> Regarding the random disconnection, I agree with your guess that the
>>> network bandwidth is being used up by remote reads, which causes lags in
>>> drillbit-to-ZooKeeper heartbeats (since these services use the same
>>> network). Maybe others can comment here.
>>> 
>>> Thank you,
>>> Sudheesh
>>> 
>>>> On Oct 12, 2016, at 6:06 PM, François Méthot  wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> We finally got rid of this error. We tried many, many things (like
>>>> modifying drill to ignore the error!); it ultimately came down to this
>>>> change:
>>>> 
>>>> from the default
>>>> planner.affinity_factor=1.2
>>>> to
>>>> planner.affinity_factor=100
>>>> 
>>>> Basically this encourages fragments to only care about locally stored
>>>> files. We looked at the code that uses that property and figured that
>>>> 100 would have a strong impact.
>>>> 
>>>> What led us to this property is the fact that 1/4 of our fragments would
>>>> take a lot more time to complete their scan, up to 10x the time of the
>>>> fastest nodes. On the slower nodes, Cloudera Manager would show very low
>>>> Disk IOPS with high Network IO compared to our faster nodes. We had
>>>> noticed that before but figured it would be some optimization to be done
>>>> later

[GitHub] drill pull request #620: add the RDBMS to the schema types

2016-10-17 Thread lvxin1986
GitHub user lvxin1986 opened a pull request:

https://github.com/apache/drill/pull/620

add the RDBMS to the schema types

Add the RDBMS to the schema types which the driver supports.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvxin1986/drill patch-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/620.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #620


commit 7ad464a9ad64a7f7b1e7113bfbc26e5f89c6e411
Author: alex.lvxin 
Date:   2016-10-18T03:00:39Z

add the RDBMS to the schema types 

Add the RDBMS to the schema types which the driver supports.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #600: DRILL-4373: Drill and Hive have incompatible timest...

2016-10-17 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/600#discussion_r83761501
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetReaderUtility.java ---
@@ -45,4 +53,34 @@ public static int getIntFromLEBytes(byte[] input, int start) {
     }
     return out;
   }
+
+  /**
+   * Utilities for converting from parquet INT96 binary (impala, hive timestamp)
+   * to date time value. This utilizes the Joda library.
+   */
+  public static class NanoTimeUtils {
+
+    public static final long NANOS_PER_DAY = TimeUnit.DAYS.toNanos(1);
+    public static final long NANOS_PER_HOUR = TimeUnit.HOURS.toNanos(1);
+    public static final long NANOS_PER_MINUTE = TimeUnit.MINUTES.toNanos(1);
+    public static final long NANOS_PER_SECOND = TimeUnit.SECONDS.toNanos(1);
+    public static final long NANOS_PER_MILLISECOND = TimeUnit.MILLISECONDS.toNanos(1);
+
+    /**
+     * @param binaryTimeStampValue
+     *          hive, impala timestamp values with nanoseconds precision
+     *          are stored in parquet Binary as INT96
+     *
+     * @return the number of milliseconds since January 1, 1970, 00:00:00 GMT
+     *         represented by binaryTimeStampValue
+     */
+    public static long getDateTimeValueFromBinary(Binary binaryTimeStampValue) {
+      NanoTime nt = NanoTime.fromBinary(binaryTimeStampValue);
+      int julianDay = nt.getJulianDay();
+      long nanosOfDay = nt.getTimeOfDayNanos();
+      return DateTimeUtils.fromJulianDay(julianDay - 0.5d) + nanosOfDay / NANOS_PER_MILLISECOND;
--- End diff --

Sorry for the late reply. For some reason, I did not see these comments 
until now. 
Regarding 1): Yes, you are correct. I just want the comments in 
ConvertFromImpalaTimestamp to be removed.
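[Editor's note] The arithmetic in the diff above can be checked independently. A Parquet INT96 timestamp carries a Julian day number plus nanoseconds-of-day, and Joda's `DateTimeUtils.fromJulianDay(julianDay - 0.5)` reduces to a midnight-based day offset from the Unix epoch. An integer-only sketch (illustrative, not the Drill/Joda code):

```python
JULIAN_DAY_OF_EPOCH = 2440588   # Julian day number of 1970-01-01 (midnight-based)
MILLIS_PER_DAY = 86_400_000
NANOS_PER_MILLISECOND = 1_000_000

def int96_to_epoch_millis(julian_day: int, nanos_of_day: int) -> int:
    """Equivalent to the diff's
    DateTimeUtils.fromJulianDay(julianDay - 0.5) + nanosOfDay / NANOS_PER_MILLISECOND:
    the -0.5 shifts the noon-based Julian day to midnight, after which the
    conversion is a plain day offset from the Unix epoch."""
    days_since_epoch = julian_day - JULIAN_DAY_OF_EPOCH
    return days_since_epoch * MILLIS_PER_DAY + nanos_of_day // NANOS_PER_MILLISECOND
```

For example, Julian day 2440588 with zero nanos-of-day maps to epoch millis 0 (1970-01-01T00:00:00Z), and each additional day adds 86,400,000 ms.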




[jira] [Created] (DRILL-4950) Consume Spurious Empty Batches in JDBC

2016-10-17 Thread Sudheesh Katkam (JIRA)
Sudheesh Katkam created DRILL-4950:
--

 Summary: Consume Spurious Empty Batches in JDBC
 Key: DRILL-4950
 URL: https://issues.apache.org/jira/browse/DRILL-4950
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC
Reporter: Sudheesh Katkam
Assignee: Sudheesh Katkam
Priority: Blocker
 Fix For: 1.9.0


In 
[DrillCursor|https://github.com/apache/drill/blob/master/exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java#L199],
consume all empty batches, not just non-contiguous empty batches. The current 
behavior results in query cancellation (from sqlline) and incomplete results.

Introduced (regression?) in DRILL-2548.
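[Editor's note] The intended loop can be sketched as follows. This is illustrative only, not DrillCursor's actual code:

```python
def next_row_batch(batches):
    """Skip every empty batch, whether or not it is contiguous with another
    empty batch, and return the first batch that actually carries rows, or
    None at end of stream."""
    for batch in batches:
        if batch["row_count"] > 0:
            return batch
    return None
```

A cursor built this way never surfaces a spurious empty batch to the JDBC client, so the client cannot mistake one for end-of-data.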



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] drill issue #602: Improve Drill C++ connector

2016-10-17 Thread superbstreak
Github user superbstreak commented on the issue:

https://github.com/apache/drill/pull/602
  
**Mac Environment:**
OS: OSX 10.8.5
COMPILER=xcode5_1

**Libraries:**
Zookeeper: 3.4.6 (patched)
boost: 1.57.0
protobuf: 2.5.0rc1

**For libc++:**
.../contrib/native/client/CMakeLists.txt:
- Add set(CMAKE_CXX_FLAGS "-stdlib=libc++") before if(CMAKE_COMPILER_IS_GNUCXX)
- Change to set(CMAKE_EXE_LINKER_FLAGS "-stdlib=libc++ -lrt -lpthread")
- Change to set(CMAKE_CXX_FLAGS "-fPIC -stdlib=libc++")

**And the minimal code changes required for zookeeper to work:**
#ifndef htonll   // <- add this
int64_t htonll(int64_t v);
#endif           // <- add this




[GitHub] drill issue #602: Improve Drill C++ connector

2016-10-17 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/602
  
OK. The Mac issue I see is caused by MacOS -  
https://issues.apache.org/jira/browse/ZOOKEEPER-2049
Just to confirm, what version of MacOS and Zookeeper did you build with?
Should we upgrade the client to use Zookeeper 3.4.7?







[GitHub] drill issue #602: Improve Drill C++ connector

2016-10-17 Thread laurentgo
Github user laurentgo commented on the issue:

https://github.com/apache/drill/pull/602
  
My build environment on Mac is OS X 10.11.6 (El Capitan) and Zookeeper 
3.4.8 (brew package).

I don't think the client needs to be updated if people are using an older 
version of OS X.




[GitHub] drill pull request #600: DRILL-4373: Drill and Hive have incompatible timest...

2016-10-17 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/600#discussion_r83710133
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java
 ---
@@ -754,15 +764,45 @@ public void testImpalaParquetVarBinary_DictChange() 
throws Exception {
 compareParquetReadersColumnar("field_impala_ts", 
"cp.`parquet/int96_dict_change.parquet`");
   }
 
+  @Test
+  public void testImpalaParquetBinaryTimeStamp_DictChange() throws 
Exception {
+try {
+  test("alter session set %s = true", 
ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+  compareParquetReadersColumnar("field_impala_ts", 
"cp.`parquet/int96_dict_change.parquet`");
--- End diff --

1. Is it better to compare the result with baseline columns and values from 
the file, or is it ok to compare with `sqlBaselineQuery` and the new 
`PARQUET_READER_INT96_AS_TIMESTAMP` option disabled?
2. In the process of investigating this test I found that the primitive data 
type of the column in the file `int96_dict_change.parquet` is BINARY, not 
INT96.
I am a little bit confused by this. Do we need to convert this BINARY to 
TIMESTAMP as well?
The CONVERT_FROM function with the IMPALA_TIMESTAMP argument works properly 
for this field.
I will investigate a little more into whether impala and hive can store 
timestamps in parquet BINARY.




[jira] [Created] (DRILL-4949) Need better handling of empty parquet files

2016-10-17 Thread Krystal (JIRA)
Krystal created DRILL-4949:
--

 Summary: Need better handling of empty parquet files
 Key: DRILL-4949
 URL: https://issues.apache.org/jira/browse/DRILL-4949
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.9.0
Reporter: Krystal


I have an empty parquet file created from hive. When I tried to query against 
this table, I got an "IllegalArgumentException".

{code}
select * from `test_dir/voter_empty`;
Error: SYSTEM ERROR: IllegalArgumentException: MinorFragmentId 0 has no read 
entries assigned

  (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception 
during fragment initialization: MinorFragmentId 0 has no read entries assigned
org.apache.drill.exec.work.foreman.Foreman.run():281
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745
  Caused By (java.lang.IllegalArgumentException) MinorFragmentId 0 has no read 
entries assigned
com.google.common.base.Preconditions.checkArgument():122
org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan():824
org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan():101
org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan():68
org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan():35
org.apache.drill.exec.physical.base.AbstractGroupScan.accept():63
org.apache.drill.exec.planner.fragment.Materializer.visitOp():102
org.apache.drill.exec.planner.fragment.Materializer.visitOp():35

org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitProject():79
org.apache.drill.exec.physical.config.Project.accept():51
org.apache.drill.exec.planner.fragment.Materializer.visitStore():82
org.apache.drill.exec.planner.fragment.Materializer.visitStore():35

org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitScreen():202
org.apache.drill.exec.physical.config.Screen.accept():98

org.apache.drill.exec.planner.fragment.SimpleParallelizer.generateWorkUnit():283
org.apache.drill.exec.planner.fragment.SimpleParallelizer.getFragments():127
org.apache.drill.exec.work.foreman.Foreman.getQueryWorkUnit():596
org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan():426
org.apache.drill.exec.work.foreman.Foreman.runSQL():1010
org.apache.drill.exec.work.foreman.Foreman.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745 (state=,code=0)
{code}

Either drill should block the query and display a user-friendly error message, 
or allow the query to run and return an empty result.
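[Editor's note] The second option can be illustrated generically. The sketch below uses hypothetical names, not Drill's actual ParquetGroupScan code: validate the assignments up front and substitute an explicit empty result instead of letting a precondition check fail deep in planning.

```python
def specific_scan(assignments, minor_fragment_id):
    """When a minor fragment has no read entries assigned (e.g. the parquet
    file is empty), return an explicit schema-only empty scan instead of
    raising an IllegalArgumentException-style error."""
    entries = assignments.get(minor_fragment_id, [])
    if not entries:
        return {"type": "empty-scan", "row_groups": []}
    return {"type": "parquet-scan", "row_groups": entries}
```

With this shape, the query above would plan successfully and return zero rows rather than a SYSTEM ERROR.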





[GitHub] drill issue #595: DRILL-4203: Parquet File. Date is stored wrongly

2016-10-17 Thread tushu1232
Github user tushu1232 commented on the issue:

https://github.com/apache/drill/pull/595
  
How can I test this fix? Thanks.




Topics for next Drill hangout (10/18/2016)

2016-10-17 Thread Jinfeng Ni
Hi everyone,

The next Drill hangout is tomorrow, Oct 18 2016, 10:00AM PDT. If you
have any suggestions for hangout topics, you can add them to this
thread. You can always join and bring up new topics at the last
minute, as we will collect the topics at the beginning of hangout.

Thank you,

Jinfeng


Drill 1.8.0 User Authentication with a custom authenticator

2016-10-17 Thread Sudip Mukherjee
Hi,
I'm using drill 1.8.0 and I have a custom authenticator implementation 
following the steps below:
https://drill.apache.org/docs/configuring-user-authentication/

Implementing and Configuring a Custom Authenticator
Administrators can use the template provided here to develop and implement a 
custom username/password based authenticator.
Complete the following steps to build and implement a custom authenticator:


When I try to get the logged-in user from schemaConfig.getUserName(), it gives 
me the process user name, which was not the case when I was using drill 1.4.0.

Could you please help? I use the logged-in user name to validate against a 
SOLR source (this storage plugin is not published).

Thanks,
Sudip