[ 
https://issues.apache.org/jira/browse/CARBONDATA-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bo Xu updated CARBONDATA-3271:
------------------------------
    Description: 
Apache CarbonData should provide a Python interface that lets deep learning 
frameworks such as TensorFlow, MXNet, and PyTorch read and write data from/to 
CarbonData. 

Basic framework:

Supports shuffled reads, which return the data in a random order each epoch 
when feeding the training model.
Supports data caching (local disk and in-memory) to speed up reads across 
multiple epochs.
Supports parallel reading using thread pools and process pools in Python.
Supports reading data from object storage.
Supports the manifest format and CarbonData folders.
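
The shuffled-read, caching, and parallel-read items above can be illustrated 
with a minimal standard-library sketch. It is not the pycarbon API: 
load_rows, read_epoch, and the file names are hypothetical, and load_rows 
only fabricates placeholder rows instead of parsing CarbonData files.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


# Hypothetical stand-in for reading one CarbonData file (NOT a pycarbon call).
# lru_cache plays the role of the in-memory cache: each path is loaded once
# and served from memory in later epochs.
@lru_cache(maxsize=None)
def load_rows(path):
    return tuple(f"{path}:row{i}" for i in range(3))


def read_epoch(paths, epoch, seed=0, workers=4):
    """Return every row exactly once, in an order that depends on the epoch."""
    rng = random.Random(seed + epoch)  # reproducible, but different per epoch
    order = list(paths)
    rng.shuffle(order)                 # shuffle at file level
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(load_rows, order))  # parallel reads
    rows = [row for chunk in chunks for row in chunk]
    rng.shuffle(rows)                  # shuffle at row level
    return rows


if __name__ == "__main__":
    files = ["part-0.carbondata", "part-1.carbondata"]  # hypothetical names
    for epoch in range(2):
        print(epoch, read_epoch(files, epoch))
```

Seeding the shuffle with the epoch number keeps each epoch reproducible 
while still varying the order between epochs. A real reader would likely 
bound the cache size and could add a local-disk tier behind it.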
AI compute engine integration:

TensorFlow integration:
    New Python API in pycarbon that lets TensorFlow read data from 
CarbonData files for model training.



  was:
Apache CarbonData should provide a Python interface that lets deep learning 
frameworks such as TensorFlow, MXNet, and PyTorch read and write data from/to 
CarbonData. It should not depend on Apache Spark.

Goals:
1. CarbonData provides a Python interface that lets TensorFlow read data 
from CarbonData for model training.
2. CarbonData provides a Python interface that lets MXNet read data from 
CarbonData for model training.
3. CarbonData provides a Python interface that lets PyTorch read data from 
CarbonData for model training.
4. CarbonData should support an epoch function.
5. CarbonData should support caching to speed up reads.



        Summary: Apache CarbonData should provide a Python interface to support 
the deep learning framework TensorFlow to read and write data from/to CarbonData  
(was: Apache CarbonData should provide a Python interface to support deep 
learning frameworks to read and write data from/to CarbonData)

> Apache CarbonData should provide a Python interface to support the deep 
> learning framework TensorFlow to read and write data from/to CarbonData
> -------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3271
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3271
>             Project: CarbonData
>          Issue Type: Sub-task
>            Reporter: Bo Xu
>            Assignee: Bo Xu
>            Priority: Major
>             Fix For: 2.0.0
>
>          Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> Apache CarbonData should provide a Python interface that lets deep learning 
> frameworks such as TensorFlow, MXNet, and PyTorch read and write data from/to 
> CarbonData. 
> Basic framework:
> Supports shuffled reads, which return the data in a random order each epoch 
> when feeding the training model.
> Supports data caching (local disk and in-memory) to speed up reads across 
> multiple epochs.
> Supports parallel reading using thread pools and process pools in Python.
> Supports reading data from object storage.
> Supports the manifest format and CarbonData folders.
> AI compute engine integration:
> TensorFlow integration:
>     New Python API in pycarbon that lets TensorFlow read data from 
> CarbonData files for model training.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
