[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-29 Thread Leemoonsoo
Github user Leemoonsoo commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
Merging to master if there is no further discussion.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-26 Thread m30m
Github user m30m commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
I'm not sure whether it's a good idea to hide this complexity in a special 
way, and I should check whether these changes are backward compatible. So I 
guess a doc-only PR, followed by a JIRA issue to handle some Spark special 
types, is a better solution.




[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-25 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
Well, it's a lot quicker to get a doc-only PR in :)
Besides, we should have a JIRA for changes like this. It's your call, @m30m 




[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-25 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
If we want to support the feature I mentioned above in another PR, then 
the document here is useless because we would have to update the doc later. 
So it would be better to do it in this PR, IMHO.




[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-25 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
Let's keep this as documentation only and open a JIRA (and another PR) 
for the DataFrame support?





[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-24 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
Yes, and you also need to update the `__getitem__` method so that users don't 
need to construct the DataFrame manually as below; `z.get("myScalaDataFrame")` 
should return a DataFrame directly.
```
myScalaDataFrame = DataFrame(z.get("myScalaDataFrame"), sqlContext)
```
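
A minimal sketch of what that `__getitem__` change could look like, using stand-in classes rather than real pyspark/py4j objects. `JavaDataFrameStub`, `DataFrameStub`, `JvmContextStub`, and `PyZeppelinContextSketch` are hypothetical names for illustration only, and how the real `PyZeppelinContext` would recognize a JVM DataFrame is an assumption here.

```python
class JavaDataFrameStub:
    """Hypothetical stand-in for the JVM DataFrame handle stored by Scala."""


class DataFrameStub:
    """Hypothetical stand-in for a Python DataFrame built from (jdf, sql_ctx)."""

    def __init__(self, jdf, sql_ctx):
        self._jdf = jdf
        self.sql_ctx = sql_ctx


class JvmContextStub:
    """Hypothetical stand-in for the JVM-side ZeppelinContext."""

    def __init__(self, items):
        self._items = dict(items)

    def get(self, key):
        return self._items[key]


class PyZeppelinContextSketch:
    """Wraps JVM DataFrame handles on the way out so callers never see them."""

    def __init__(self, z, sql_context):
        self.z = z
        self.sql_context = sql_context

    def get(self, key):
        return self.__getitem__(key)

    def __getitem__(self, key):
        value = self.z.get(key)
        # Assumption: a JVM DataFrame can be detected by type on the Python side.
        if isinstance(value, JavaDataFrameStub):
            return DataFrameStub(value, self.sql_context)
        return value


jdf = JavaDataFrameStub()
z = PyZeppelinContextSketch(
    JvmContextStub({"myScalaDataFrame": jdf, "n": 3}), "sqlContext"
)
my_df = z.get("myScalaDataFrame")  # already wrapped, no manual constructor call
```

With this shape, the manual wrapping step from the snippet above would no longer be needed in user code, while non-DataFrame values pass through unchanged.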




[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-24 Thread m30m
Github user m30m commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
Yes, that's a good idea. Shall I add a commit to this branch?




[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-24 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
I mean we can do this internally in `PyZeppelinContext`, as follows:
```
def __setitem__(self, key, item):
    # Unwrap a Python DataFrame to its underlying Java object before storing it
    if isinstance(item, DataFrame):
        self.z.put(key, item._jdf)
    else:
        self.z.put(key, item)
```
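
The `__setitem__` idea above can be tried out without Spark. The sketch below uses stand-in classes, all of them hypothetical names (`JavaHandleStub`, `DataFrameStub`, `JvmContextStub`, `PyZeppelinContextSketch`). The stub `put` only imitates, in a simplified way, the `_get_object_id()` lookup seen in the py4j traceback reported in this thread; it is not py4j's actual implementation.

```python
class JavaHandleStub:
    """Hypothetical stand-in for a py4j Java object reference."""

    def _get_object_id(self):
        return "o42"


class DataFrameStub:
    """Hypothetical stand-in for a Python DataFrame holding its Java object in `_jdf`."""

    def __init__(self, jdf):
        self._jdf = jdf


class JvmContextStub:
    """Hypothetical stand-in for the JVM-side ZeppelinContext."""

    def __init__(self):
        self.items = {}

    def put(self, key, value):
        # Simplified imitation: non-primitive arguments must expose _get_object_id().
        if not isinstance(value, (bool, int, float, str)):
            value._get_object_id()  # AttributeError for a bare DataFrameStub
        self.items[key] = value


class PyZeppelinContextSketch:
    def __init__(self, z):
        self.z = z

    def __setitem__(self, key, item):
        # Unwrap the DataFrame so only the Java-side handle crosses the gateway.
        if isinstance(item, DataFrameStub):
            self.z.put(key, item._jdf)
        else:
            self.z.put(key, item)


jvm = JvmContextStub()
z = PyZeppelinContextSketch(jvm)
posts_df = DataFrameStub(JavaHandleStub())
z["myPythonDataFrame"] = posts_df  # stored as the Java handle, not the wrapper
z["answer"] = 42                   # plain values are stored unchanged
```

Calling `jvm.put("direct", posts_df)` on the stub directly raises `AttributeError`, mirroring the failure mode that the unwrapping is meant to avoid.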




[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-24 Thread m30m
Github user m30m commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
It's not possible to put the DataFrame directly because of this error:
```
Exception: Traceback (most recent call last):
  File "/spark-2.0.1-bin-hadoop2.7/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1124, in __call__
    args_command, temp_args = self._build_args(*args)

  File "/spark-2.0.1-bin-hadoop2.7/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1094, in _build_args
    [get_command_part(arg, self.pool) for arg in new_args])

  File "/spark-2.0.1-bin-hadoop2.7/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 289, in get_command_part
    command_part = REFERENCE_TYPE + parameter._get_object_id()

  File "/spark-2.0.1-bin-hadoop2.7/python/pyspark/sql/dataframe.py", line 841, in __getattr__
    "'%s' object has no attribute '%s'" % (self.__class__.__name__, name))

AttributeError: 'DataFrame' object has no attribute '_get_object_id'
```




[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-24 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
Should we do it implicitly for the user in `ZeppelinContext`? I feel the 
syntax below is not easy to understand if the user doesn't know the internal 
implementation of pyspark, and I think we should not expose such internals 
to users.

```
z.put("myPythonDataFrame", postsDf._jdf)
```




[GitHub] zeppelin issue #1677: Add doc for exchanging data frames

2016-11-24 Thread Leemoonsoo
Github user Leemoonsoo commented on the issue:

https://github.com/apache/zeppelin/pull/1677
  
@m30m Awesome!

LGTM. Merging to master if there are no more comments.

