[GitHub] tinkerpop pull request #875: TINKERPOP-1967 Add a connectedComponent() step ...

2018-06-13 Thread vtslab
Github user vtslab closed the pull request at:

https://github.com/apache/tinkerpop/pull/875


---


[GitHub] tinkerpop issue #875: TINKERPOP-1967 Add a connectedComponent() step - vtsla...

2018-06-13 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/875
  
The PR is malformed due to problems in master. I will close this one and
prepare a new PR with the same commits.


---


[GitHub] tinkerpop pull request #877: Tinkerpop 1967 Add a connectedComponent() step ...

2018-06-13 Thread vtslab
GitHub user vtslab opened a pull request:

https://github.com/apache/tinkerpop/pull/877

Tinkerpop 1967 Add a connectedComponent() step - vtslab contribution2

This merges the new OLAP step into a corrected version of the old recipe. I
did not adapt the release-update file, which is no longer consistent with the
situation (I leave that to you). I made a comment in the TINKERPOP-1967 branch
proper about the gremlin-console import section.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vtslab/incubator-tinkerpop 
TINKERPOP-1967-vtslab2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/877.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #877


commit 28d4b02660f3f5c682538acaf4768218d9a8b40a
Author: HadoopMarc 
Date:   2018-05-21T12:03:54Z

Merged vtslab recipe for connected components

commit b087822708707013f7f0cd3b5abaf6d0f574a72e
Author: HadoopMarc 
Date:   2018-06-10T13:17:17Z

Extended the connected-components recipe




---


[GitHub] tinkerpop pull request #875: TINKERPOP-1967 Add a connectedComponent() step ...

2018-06-10 Thread vtslab
GitHub user vtslab opened a pull request:

https://github.com/apache/tinkerpop/pull/875

TINKERPOP-1967  Add a connectedComponent() step - vtslab contribution

This merges the new OLAP step into a corrected version of the old recipe. I
did not adapt the release-update file, which is no longer consistent with the
situation (I leave that to you). I made a comment in the TINKERPOP-1967 branch
proper about the gremlin-console import section.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vtslab/incubator-tinkerpop 
TINKERPOP-1967-vtslab

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/875.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #875


commit f91d3d9d21f1e921a071200668b0fd0e33b321a8
Author: Stephen Mallette 
Date:   2018-05-17T18:44:01Z

TINKERPOP-1967 Added connectedComponent() step

Deprecated the recipe for "Connected Components" but left the old content 
present as I felt it had educational value.

commit 552cc228f4b86ddf76c250667466291ece2fc705
Author: HadoopMarc 
Date:   2018-05-21T12:03:54Z

    Merged vtslab recipe for connected components

commit 657ffd423df093909fc861a079280e9e2b94f100
Author: HadoopMarc 
Date:   2018-06-10T13:17:17Z

Extended the connected-components recipe




---


[GitHub] tinkerpop pull request #:

2018-06-10 Thread vtslab
Github user vtslab commented on the pull request:


https://github.com/apache/tinkerpop/commit/06fdb7d1ac319c7f8afc115a54aa535e8078039b#commitcomment-29309765
  
You might want to include the ConnectedComponentVertexProgram in the
imports section of gremlin-console, as was done for the other available vertex
programs.


---


[GitHub] tinkerpop issue #721: TINKERPOP-1786 Recipe and missing manifest items for S...

2017-10-18 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/721
  
I am fine with the PR now. Build server needs a check, though.


---


[GitHub] tinkerpop issue #721: TINKERPOP-1786 Recipe and missing manifest items for S...

2017-10-17 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/721
  
Please do not merge yet, I just noticed two wrong links. I will correct 
this later in the week together with the tp33/master branch.

Marc


---


[GitHub] tinkerpop issue #721: TINKERPOP-1786 Recipe and missing manifest items for S...

2017-10-16 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/721
  
Thanks @pluradj for going the extra mile.


---


[GitHub] tinkerpop issue #721: TINKERPOP-1786 Recipe and missing manifest items for S...

2017-10-12 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/721
  
Hi @spmallette,
No problem, but it is unclear to me whether JIRA will also add items to the 
list then, apart from adding the section headings. Where will my two change 
items appear, for which item(s) should I add the JIRA issue number? 
And for the TP3.3 line (the other PR), should the two new items be in the 
3.2 section, the 3.3 section, in both or in none?
Cheers, Marc


---


[GitHub] tinkerpop issue #721: TINKERPOP-1786 Recipe and missing manifest items for S...

2017-10-12 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/721
  
You're welcome. Thanks all for the initial suggestions on the dev list and
the review comments above.


---


[GitHub] tinkerpop pull request #721: TINKERPOP-1786 Recipe and missing manifest item...

2017-10-02 Thread vtslab
Github user vtslab commented on a diff in the pull request:

https://github.com/apache/tinkerpop/pull/721#discussion_r142223415
  
--- Diff: hadoop-gremlin/conf/hadoop-gryo.properties ---
@@ -29,8 +29,8 @@ gremlin.hadoop.outputLocation=output
 spark.master=local[4]
 spark.executor.memory=1g
 spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
+gremlin.spark.persistContext=true
--- End diff --

Good question, I had not justified this yet. My original reason was that 
stopping both the SparkContext and the gremlin console as in the docs 
generation, can lead to race conditions in spark-yarn with random connection 
exceptions showing up in the console output in the docs. But as a bonus, 
follow-up OLAP queries get answered much faster as you skip the overhead for 
getting resources from yarn. This is what is also done in Apache Zeppelin, 
Spark shell and the like.

The alternative is to set the property in the console together with the 
other properties. This would require some more explanation and configuration 
work afterwards to/from the recipe users, but would leave the properties file 
untouched. I like the current proposal better, but I am fine with both.
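
For readers weighing the two options, here is a minimal sketch of the alternative: leave the properties file untouched and apply the setting in code. It uses only `java.util.Properties` as a stand-in for the graph configuration the console would actually hold. The class and method names are hypothetical, and the base settings are just the two context lines from the diff above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PersistContextSketch {

    // Stand-in for the relevant lines of hadoop-gryo.properties, WITHOUT the new setting.
    static final String BASE_PROPERTIES =
            "spark.master=local[4]\n"
          + "spark.executor.memory=1g\n";

    // Loads the base settings and adds the override in code,
    // so the properties file itself stays unchanged.
    static Properties withPersistContext() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(BASE_PROPERTIES));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        props.setProperty("gremlin.spark.persistContext", "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = withPersistContext();
        System.out.println(props.getProperty("gremlin.spark.persistContext"));
    }
}
```

The trade-off is the one described above: every recipe user has to repeat the override, whereas a line in the properties file applies to everyone.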


---


[GitHub] tinkerpop issue #721: TINKERPOP-1786 Recipe and missing manifest items for S...

2017-09-26 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/721
  
Could I assume a pseudo-hadoop cluster to be present during the integration
test phase? I thought only the asciidoc processing had that. Anyway, breaking
the spark-yarn option will be noticed through the docs processing, but
hopefully not until hadoop-3 or spark-3.
I'll correct the TinkerPop naming, of course, and will also add a pointer
to the gremlin-plugin-dependencies section of the spark-gremlin manifest file.


---


[GitHub] tinkerpop pull request #722: TINKERPOP-1786 Recipe and missing manifest item...

2017-09-24 Thread vtslab
GitHub user vtslab opened a pull request:

https://github.com/apache/tinkerpop/pull/722

TINKERPOP-1786 Recipe and missing manifest items for Spark on Yarn (TP33)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vtslab/incubator-tinkerpop 
spark-yarn-recipe-tp33

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/722.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #722


commit eaf8be65c09e1d03a59fdba677a6609147193e7c
Author: HadoopMarc <vts...@xs4all.nl>
Date:   2017-09-10T12:45:45Z

Added spark-yarn recipe and missing manifest items in spark-gremlin

commit 97cba2bbe5e732e70f3f13d309b2b0ce3cf26067
Author: HadoopMarc <vts...@xs4all.nl>
Date:   2017-09-20T06:12:48Z

Changes relative to tp32 to get spark-2.2 on yarn working




---


[GitHub] tinkerpop pull request #721: TINKERPOP-1786 Recipe and missing manifest item...

2017-09-24 Thread vtslab
GitHub user vtslab opened a pull request:

https://github.com/apache/tinkerpop/pull/721

TINKERPOP-1786 Recipe and missing manifest items for Spark on Yarn (TP32)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vtslab/incubator-tinkerpop 
spark-yarn-recipe-tp32

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/721.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #721


commit 250042b66b49d73619f7f25177c7ce755202e337
Author: HadoopMarc <vts...@xs4all.nl>
Date:   2017-09-10T12:45:45Z

Added spark-yarn recipe and missing manifest items in spark-gremlin




---


[GitHub] tinkerpop issue #534: TINKERPOP-1566 Kerberos authentication for gremlin-ser...

2017-02-25 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/534
  
A good catch by @robertdale indeed, about the bytecode requests. I was not
even aware those were human readable. A proposal: include the audit logging of
bytecode requests in a new Jira ticket that addresses authentication using
gremlin-python, because the integration test setup will look similar.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] tinkerpop pull request #534: TINKERPOP-1566 Kerberos authentication for grem...

2017-02-21 Thread vtslab
Github user vtslab commented on a diff in the pull request:

https://github.com/apache/tinkerpop/pull/534#discussion_r102313638
  
--- Diff: gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/HttpBasicAuthenticationHandler.java ---
@@ -92,6 +102,13 @@ public void channelRead(final ChannelHandlerContext ctx, final Object msg) {
 try {
     authenticator.authenticate(credentials);
     ctx.fireChannelRead(request);
+
+    // User name logged with the remote socket address and authenticator classname for audit logging
+    if (authenticationSettings.enableAuditLog) {
+        String[] authClassParts = authenticator.getClass().toString().split("[.]");
+        auditLogger.info("User {} with address {} authenticated by {}", credentials.get(PROPERTY_USERNAME),
+            ctx.channel().remoteAddress().toString().substring(1), authClassParts[authClassParts.length - 1]);
--- End diff --

Thanks for clarifying, I will correct it. Tests are still running.
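
As a reading aid for the diff above, the sketch below reproduces only the class-name extraction (`getClass().toString().split("[.]")`) on JDK classes, so it runs without gremlin-server. The helper name `lastPartBySplit` is hypothetical. `Class.getSimpleName()` is printed only as a point of comparison. It is not necessarily what the review asked for.

```java
import java.util.Map;

public class AuditClassNameSketch {

    // Mirrors the approach in the diff: take Class.toString() and keep
    // whatever follows the last dot.
    static String lastPartBySplit(Class<?> type) {
        String[] parts = type.toString().split("[.]");
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        // Top-level class: both approaches give the same short name.
        System.out.println(lastPartBySplit(String.class));
        System.out.println(String.class.getSimpleName());

        // Nested type: the split keeps the outer name and the '$' separator,
        // while getSimpleName() returns only the inner name.
        System.out.println(lastPartBySplit(Map.Entry.class));
        System.out.println(Map.Entry.class.getSimpleName());
    }
}
```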


---


[GitHub] tinkerpop pull request #534: TINKERPOP-1566 Kerberos authentication for grem...

2017-02-20 Thread vtslab
Github user vtslab commented on a diff in the pull request:

https://github.com/apache/tinkerpop/pull/534#discussion_r102095814
  
--- Diff: docs/src/reference/gremlin-applications.asciidoc ---
@@ -1035,6 +1035,7 @@ The following table describes the various YAML configuration options that Gremli
 |=
 |Key |Description |Default
 |authentication.className |The fully qualified classname of an `Authenticator` implementation to use.  If this setting is not present, then authentication is effectively disabled. |`AllowAllAuthenticator`
+|authentication.enableAuditLog |The available authenticators can issue audit logging messages, binding the authenticated user to his remote socket address and binding requests with a gremlin query to the remote socket address. For privacy reasons, the default value of this setting is false. The audit logging messages are logged at the INFO level via the `audit.org.apache.tinkerpop.gremlin.server` logger, which can be configured using the log4j.properties file. |false
--- End diff --

Answer to this riddle (needed some thought on my part): also other config 
items, like scriptEngines..config, are at the bottom of their section. 
Just because it looks more orderly in the config file.


---


[GitHub] tinkerpop issue #534: TINKERPOP-1566 Kerberos authentication for gremlin-ser...

2017-02-19 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/534
  
Thanks, Stephen, for guiding me through this so far. Once this gets merged, 
I'd like to take a look at authentication with the python driver.


---


[GitHub] tinkerpop issue #534: TINKERPOP-1566 Kerberos authentication for gremlin-ser...

2017-02-02 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/534
  
I interpreted almost a week of silence as consensus on the gremlin-driver
behavior, so I made the following changes:
- restored gremlin-driver's Handler (checkout from master)
- added a ToDo to gremlin-driver's Handler regarding receiving allowed 
mechanisms from gremlin-server
- added GSSException fail options to three tests that required so
- corrected the changelog


---


gremlin-python test SimpleAuthenticator

2017-01-30 Thread vtslab


While trying to get gremlin-python to work with the proposed Kerberos 
authenticator for gremlin-server 
(https://github.com/apache/tinkerpop/pull/534), I noticed that 
gremlin-python's pom.xml starts a gremlin-server with 
SimpleAuthenticator (port 45941) but no tests seem to use it, nor are
the user= and password= arguments of the driver_remote_connection.py
ever tested. Is this still work in progress or am I missing something?


Possibly related to this, @davebshow remarks about the recently
closed/merged issue https://issues.apache.org/jira/browse/TINKERPOP-1600
that all gremlin-python remote_driver_connection tests pass. Does this
include username/password authentication, for which the netty handler
coding possibly changed due to TINKERPOP-1600?


Cheers,   Marc


[GitHub] tinkerpop issue #534: TINKERPOP-1566 Kerberos authentication for gremlin-ser...

2017-01-26 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/534
  
Hi @mike-tr-adamson, I am glad you entered the discussion. I think your 
main point is valid, namely that there are circumstances, pointed out by you, 
when gremlin-driver should select the GSSAPI mechanism even though no 
JAAS_ENTRY is specified (ToDo: make a test for this to safeguard the desired 
behavior).
Having said this, the old behavior (select GSSAPI out of the blue if no 
username/password is supplied) also has its risks and problems given the 
multitude of SASL mechanisms that people could want to use, see 
[http://www.iana.org/assignments/sasl-mechanisms/sasl-mechanisms.xhtml](http://www.iana.org/assignments/sasl-mechanisms/sasl-mechanisms.xhtml).
 Ideally, one would want gremlin-server to provide a token with the 
mechanism(s) it supports, so that gremlin-driver can use this to instantiate 
the SaslClient properly. 
In your case, with `javax.security.auth.useSubjectCredsOnly=false` 
configured, you would have a Gremlin-Server with a Krb5Authenticator 
configured, the server would provide the GSSAPI token in its authentication 
request and gremlin-driver would know to select the GSSAPI mechanism. 
However, this ideal situation requires more changes to the gremlin-driver 
and gremlin-server code. 
I could live now with adding the GSSException as an option to the tests 
with your explanation how it could be a valid option. This solves the current 
challenge and we can add this discussion as comments to the code for future 
reference, when requirements for other SASL mechanisms pop up.
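
The "ideal situation" described above, where the server advertises the mechanisms it supports and the driver picks one, can be sketched as a small pure function. Everything here is hypothetical: neither the class, the method, nor the advertised-mechanism list exists in gremlin-driver or gremlin-server as discussed in this thread. The preference order (PLAIN when a username/password is supplied, GSSAPI otherwise) is an assumption made only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class SaslMechanismChoiceSketch {

    // Chooses a SASL mechanism from the list the server advertises
    // (hypothetical protocol extension, not part of the current code).
    static Optional<String> choose(List<String> advertised, boolean hasUsernamePassword) {
        if (hasUsernamePassword && advertised.contains("PLAIN")) {
            return Optional.of("PLAIN");
        }
        if (advertised.contains("GSSAPI")) {
            // Covers the useSubjectCredsOnly=false case: no JAAS entry is needed
            // on the client because the server already announced Kerberos.
            return Optional.of("GSSAPI");
        }
        // No mechanism in common: fail instead of guessing one.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(choose(Arrays.asList("GSSAPI"), false));
        System.out.println(choose(Arrays.asList("PLAIN"), true));
        System.out.println(choose(Arrays.asList("PLAIN"), false));
    }
}
```

The point of the sketch is that the driver no longer selects GSSAPI "out of the blue": it selects it only when the server has announced it.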


---


[GitHub] tinkerpop issue #534: TINKERPOP-1566 Kerberos authentication for gremlin-ser...

2017-01-20 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/534
  
OK, thanks, I'll see to it. The downside of the third option was implicit:
it changes the driver code, which is in production everywhere. But that's why
we do this work in the 3.3.x line, and I think it will be an improvement.


---


[GitHub] tinkerpop issue #534: TINKERPOP-1566 Kerberos authentication for gremlin-ser...

2017-01-20 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/534
  
Hi, I am working on the two failing integration tests. They can be fixed in
three ways:
 1. just hide the symptoms and also allow a GSSException for a test that 
should fail anyway:   ugly!
 2. configure false for the failsafe plugin: fine, but may impact 
test performance
 3. adapt the gremlin-driver handler code, which now chooses GSSAPI if no 
credentials are supplied. This should be a test on an available JaasEntry.

Which of these would you like to see committed on the PR?
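
Option 3 amounts to asking JAAS whether a named entry exists before choosing GSSAPI. A minimal sketch of such a check follows, using the standard `javax.security.auth.login.Configuration` API with an in-memory configuration so that no config file is needed. The entry name `GremlinConsole`, the class name and the helper methods are hypothetical. This is not the handler code from the PR.

```java
import java.util.Collections;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;

public class JaasEntryCheckSketch {

    // In-memory JAAS configuration holding a single named entry.
    static Configuration singleEntry(final String entryName) {
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                if (!entryName.equals(name)) {
                    return null; // JAAS convention: null means "no such entry"
                }
                return new AppConfigurationEntry[] {
                    new AppConfigurationEntry(
                        "com.sun.security.auth.module.Krb5LoginModule",
                        AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                        Collections.<String, Object>emptyMap())
                };
            }
        };
    }

    // True only if the configuration really contains the requested entry.
    static boolean hasJaasEntry(Configuration config, String name) {
        AppConfigurationEntry[] entries = config.getAppConfigurationEntry(name);
        return entries != null && entries.length > 0;
    }

    // GSSAPI only when a JAAS entry is available, otherwise fall back to PLAIN.
    static String mechanismFor(Configuration config, String name) {
        return hasJaasEntry(config, name) ? "GSSAPI" : "PLAIN";
    }

    public static void main(String[] args) {
        Configuration config = singleEntry("GremlinConsole");
        System.out.println(mechanismFor(config, "GremlinConsole"));
        System.out.println(mechanismFor(config, "SomethingElse"));
    }
}
```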


---


[GitHub] tinkerpop issue #534: TINKERPOP-1566 Kerberos authentication for gremlin-ser...

2017-01-18 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/534
  
Yes, I mentioned these failing tests in the PR text above. I suspect that 
the java security libs just pick up the krb5.conf file from the test resources 
"without asking". It means the test fails faster and tries to find credentials 
elsewhere when not provided. To resolve it, we either have to adapt the test or 
we have to try to configure Kerberos without a krb5.conf file (if that is the 
cause). Either way, it does not feel nice if test outcomes depend on 
unnecessary components being on the classpath or not, so maybe we should look 
for a third way.

PR title should be OK now.


---


[GitHub] tinkerpop issue #533: TINKERPOP-1600 Added base64 encoded string to sasl cha...

2017-01-17 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/533
  
I agree with your explanation that a byte array returned from gremlin 
server as the result of a query does not crash gremlin driver (and I also 
checked it manually). Sorry for the confusion introduced.


---


[GitHub] tinkerpop issue #533: TINKERPOP-1600 Added base64 encoded string to sasl cha...

2017-01-17 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/533
  
Just undoing [this commit](https://github.com/apache/tinkerpop/pull/534/commits/62648242c6576b020d2dd2933b89b9d69e87fed0)
in TINKERPOP-1566 and merging in TINKERPOP-1600 does not work for me, see 
below. I did not dig in yet, maybe you recognize what is happening. Other tests 
without serializeResultToString configured fail the same way. Btw, in my 
testing branch I renamed GremlinServerAuthKrb5IntegrateTest to 
GremlinServerAuthKrb5Test for faster testing.

Tests run: 8, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 34.711 sec <<< FAILURE! - in org.apache.tinkerpop.gremlin.server.GremlinServerAuthKrb5Test

shouldAuthenticateWithSerializeResultToString(org.apache.tinkerpop.gremlin.server.GremlinServerAuthKrb5Test)  Time elapsed: 5.611 sec  <<< ERROR!
java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Illegal base64 character 5b
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at org.apache.tinkerpop.gremlin.server.GremlinServerAuthKrb5Test.shouldAuthenticateWithSerializeResultToString(GremlinServerAuthKrb5Test.java:220)
Caused by: java.lang.IllegalArgumentException: Illegal base64 character 5b
    at java.util.Base64$Decoder.decode0(Base64.java:714)
    at java.util.Base64$Decoder.decode(Base64.java:526)
    at java.util.Base64$Decoder.decode(Base64.java:549)
    at org.apache.tinkerpop.gremlin.driver.Handler$GremlinSaslAuthenticationHandler.channelRead0(Handler.java:119)
    at org.apache.tinkerpop.gremlin.driver.Handler$GremlinSaslAuthenticationHandler.channelRead0(Handler.java:67)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
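
A note on the error text: `5b` is the hexadecimal code of `[`. One way to get exactly this message is to hand the JDK decoder the `toString()` form of a byte array instead of a Base64 string. The sketch below reproduces that with plain JDK classes. Whether this is what actually happened in the handler is an assumption and was not verified against the TinkerPop code. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class Base64MismatchSketch {

    // Returns the decoder's error message, or null if decoding succeeded.
    static String decodeError(String text) {
        try {
            Base64.getDecoder().decode(text);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        byte[] token = "test1".getBytes(StandardCharsets.UTF_8);

        // A byte array rendered as text starts with '[' (0x5b), which is not
        // part of the Base64 alphabet, so the decoder rejects it.
        String asToString = Arrays.toString(token);
        System.out.println(decodeError(asToString));

        // The same bytes, properly Base64-encoded, decode without error.
        String asBase64 = Base64.getEncoder().encodeToString(token);
        System.out.println(decodeError(asBase64));
    }
}
```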



---


[GitHub] tinkerpop issue #533: TINKERPOP-1600 Added base64 encoded string to sasl cha...

2017-01-16 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/533
  
While this solves the issue for byte[] returned from Sasl, I can still 
crash the driver by adding a byte[] as a vertex property and asking for the result:
gremlin> g.V(1).property('test1', 'test1' as byte[])
==>v[1]
gremlin> g.V(1).values('test1')
==>[116, 101, 115, 116, 49]
gremlin> g.V(1).values('test1').next().getClass()
==>class [B
OK, this is a pathological example, why configure serializeResultToString 
if you want a byte[] returned... BTW, what is this serializeResultToString 
meant for anyway?


---


[GitHub] tinkerpop issue #533: TINKERPOP-1600 Added base64 encoded string to sasl cha...

2017-01-16 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/533
  
This attacks the same issue as I had in PR TINKERPOP-1566 Kerberos:

[https://github.com/apache/tinkerpop/pull/534/commits/62648242c6576b020d2dd2933b89b9d69e87fed0]
I am fine with your solution in TINKERPOP-1600. 


---


[GitHub] tinkerpop pull request #534: Tinkerpop 1566

2017-01-16 Thread vtslab
GitHub user vtslab opened a pull request:

https://github.com/apache/tinkerpop/pull/534

Tinkerpop 1566

This PR includes three items (as stated in the changelog):
  1 Added Kerberos authentication to `gremlin-server` for websockets and
    nio transport.
  2 Added audit logging of authenticated users and of gremlin queries to
    `gremlin-server`.
  3 Fixed `gremlin-driver`'s support for string results regarding returned
    byte arrays from `Sasl` authentication.

Regarding item 1, I did not attempt to provide Kerberos authentication for
http transport, as I assumed http will not be very popular anymore, now that
the GLVs are available for accessing graphs via gremlin-server.

Item 2, audit logging, naturally belongs to Kerberos authentication. Kerberos
is important in providing access to confidential data, that is, being sure of
someone's identity without having them log in for each service. Some
confidential data, like personal data, often carry legal obligations regarding
logging of their access: that is what item 2 provides.

Item 3 is just a minor issue that surfaced during test development of
Kerberos authentication.

An ample number of integration tests is provided. In addition, I did manual
tests in a working freeIPA Kerberos environment to verify the proper working.

Reviewers wanting a short reminder of Kerberos authentication are referred to:
http://www.roguelynn.com/words/explain-like-im-5-kerberos/
[It taught me a lot, I am not trying to be arrogant :-)]

The main design choices I made are:

i) Krb5Authenticator does not refer to policy servers or storage backends for
authorization, but rather assumes that any user who can be authenticated using
Kerberos is also authorized to access the service. Others could extend on
this.

ii) The JAAS entry for Krb5Authenticator was not made configurable, apart from
the principal name and keytab location to be provided in the yaml file. Using
a separate JAAS config file would primarily introduce more flexibility in
getting the config wrong. This choice is in line with the current situation
with all authenticator configuration in gremlin-server's yaml file. But the
choice is not consistent with other Apache projects like Hadoop and HBase.

iii) Audit logging was made into a general feature that also works for other
authenticators. It has to be explicitly enabled, though, with a property in
gremlin-server's yaml file, because the audit logs can contain confidential
data.

iv) Audit logging was given a separate logger apart from the
org.apache.tinkerpop naming tree, so that its level can be set to INFO without
influencing level settings of the normal logging. The logger name,
"audit.org.apache.tinkerpop.gremlin.server", was defined in GremlinServer, for
lack of a better location.
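
The effect of putting the audit logger outside the `org.apache.tinkerpop` naming tree can be illustrated with `java.util.logging`, used here as a stdlib stand-in for the log4j setup the PR configures. The logger names come from the text above. The class name and the chosen levels are assumptions for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class AuditLoggerNamespaceSketch {

    // Strong references keep java.util.logging from garbage-collecting
    // the loggers once they have been configured.
    static final Logger TINKERPOP = Logger.getLogger("org.apache.tinkerpop");
    static final Logger SERVER =
            Logger.getLogger("org.apache.tinkerpop.gremlin.server.GremlinServer");
    static final Logger AUDIT =
            Logger.getLogger("audit.org.apache.tinkerpop.gremlin.server");

    static void configure() {
        // Quieten the normal logging tree; child loggers inherit this level.
        TINKERPOP.setLevel(Level.WARNING);
        // The audit logger is not a child of org.apache.tinkerpop, so it can
        // stay at INFO independently.
        AUDIT.setLevel(Level.INFO);
    }

    public static void main(String[] args) {
        configure();
        System.out.println(SERVER.isLoggable(Level.INFO));
        System.out.println(AUDIT.isLoggable(Level.INFO));
    }
}
```

Had the audit logger been named inside the `org.apache.tinkerpop` tree, raising the threshold of the parent would have required an extra, explicit override to keep audit messages flowing.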

v) Apache Kerby was used as the Kerberos Key Distribution Center (KDC) for the
Kerberos integration tests, because it also belongs to Apache and proved easy
to use. The project is still in RC2 status, but it is only a test dependency.

Finally, running the integration tests on gremlin-server still results in two
errors, probably due to the presence of the gremlinjaas.conf file in the test
resources. I did not correct these, because I was not sure whether they could
be due to my test environment.

Failed tests: 
  
GremlinServerAuthIntegrateTest.shouldFailAuthenticateWithPlainTextNoCredentials:130
 
expected: 
but was:
  
GremlinServerAuthOldIntegrateTest.shouldFailAuthenticateWithPlainTextNoCredentials:133
 
expected: 
but was:


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vtslab/incubator-tinkerpop TINKERPOP-1566

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/534.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #534


commit f09546e8bf61d0fc5ab5d92568ee9e7b6e86773e
Author: vtslab <vts...@xs4all.nl>
Date:   2016-11-25T10:48:32Z

Kerberos authenticator files added

commit 0f43649f86fcf049a5d2749387f832d2ed71fa9f
Author: vtslab <vts...@xs4all.nl>
Date:   2016-11-28T12:16:51Z

Added failing test shouldAuthenticateWithSerializeResultToString

commit d874207fb2f53139cc131012a866b3c271a0f73f
Author: HadoopMarc <vts...@xs4all.nl>
Date:   2016-12-03T15:41:47Z

Fixed problem with non-lowercase hostname

commit debb7c854a9a3042f091f5751182b2f563151f1e
Author: HadoopMarc <vts...@xs4all.nl>
Date:   2016-12-04T15:14:05Z

Added Kerberos tests for client

commit cb81fcfbc968c09bee4ac4ab059adf4782df037d
Author: HadoopMarc