[ 
https://issues.apache.org/jira/browse/TAJO-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182434#comment-14182434
 ] 

Jihoon Son commented on TAJO-1123:
----------------------------------

Hi [~babokim], thanks for your work.
This change is truly necessary. Actually, I have also thought about this 
issue. IMO, the extensibility of the Fragment is very important for supporting 
various storage types in the future.
So, I have some questions about how you plan to proceed.

1. On the design of the Fragment interface. I've looked over your patch and 
found the changed interface. I wonder whether this design is enough to 
represent the various types of storage. 

As you intended in this issue, the detailed characteristics of the storage 
layer should be abstracted away from higher layers such as the query planner. 
That is, in the higher layers, all fragments can be handled in the same way 
regardless of the storage type. However, storages of different types have 
diverse characteristics, so different attributes can be important in each 
store. For example, HDFS is a distributed file system, so the start offset and 
the length are important information. In HBase, the start and end keys of a 
region can be important. 

So, IMO, the important question is how we can provide the storage's necessary 
attributes to the upper layer. Do you have any good ideas?
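
To make the question concrete, here is a minimal sketch of one possible 
shape. It is only an assumption for discussion, not what the patch does: the 
names Fragment.getStoreType(), getAttributes(), FileFragmentSketch and 
HBaseFragmentSketch are all hypothetical. The idea is that the planner sees a 
uniform interface, while each store exposes its own attributes as a generic 
map.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface; none of these method names come from the patch.
interface Fragment {
    String getTableName();
    String getStoreType();               // e.g. "HDFS" or "HBASE" (assumed labels)
    Map<String, String> getAttributes(); // storage-specific attributes, opaque to the planner
}

// File-based stores care about the path, the start offset and the length.
final class FileFragmentSketch implements Fragment {
    private final String tableName;
    private final Map<String, String> attrs = new LinkedHashMap<>();

    FileFragmentSketch(String tableName, String path, long startOffset, long length) {
        this.tableName = tableName;
        attrs.put("path", path);
        attrs.put("startOffset", Long.toString(startOffset));
        attrs.put("length", Long.toString(length));
    }

    public String getTableName() { return tableName; }
    public String getStoreType() { return "HDFS"; }
    public Map<String, String> getAttributes() { return Collections.unmodifiableMap(attrs); }
}

// HBase-like stores care about the start and end keys of a region.
final class HBaseFragmentSketch implements Fragment {
    private final String tableName;
    private final Map<String, String> attrs = new LinkedHashMap<>();

    HBaseFragmentSketch(String tableName, String startKey, String endKey) {
        this.tableName = tableName;
        attrs.put("startKey", startKey);
        attrs.put("endKey", endKey);
    }

    public String getTableName() { return tableName; }
    public String getStoreType() { return "HBASE"; }
    public Map<String, String> getAttributes() { return Collections.unmodifiableMap(attrs); }
}
```

With something like this, the planner could treat every fragment as a 
Fragment, and only the scanner for a given store type would interpret the 
attribute map. A string map is the simplest option but loses type safety, so 
typed accessors per store may be preferable.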

2. On the design of the protocol buffer interface of the fragment. To support 
various types of storage, we need to design the protocol buffer interface of 
the fragment to be extensible. This issue is not included in your patch, but I 
would like to hear your opinion.
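
As one possible direction, the serialized fragment could be a small envelope 
that carries a store type identifier plus an opaque byte payload, with each 
storage registering its own decoder. The sketch below shows that idea in 
plain Java rather than in a .proto file. FragmentEnvelope and 
FragmentCodecRegistry are hypothetical names used only for illustration, not 
existing Tajo classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical envelope: roughly a message with a string type id and a bytes payload.
final class FragmentEnvelope {
    final String storeType;  // identifies which storage produced the payload
    final byte[] contents;   // storage-specific serialized fragment, opaque to upper layers

    FragmentEnvelope(String storeType, byte[] contents) {
        this.storeType = storeType;
        this.contents = contents.clone();
    }
}

// Hypothetical registry mapping a store type to the decoder for its payload.
final class FragmentCodecRegistry {
    private final Map<String, Function<byte[], Object>> decoders = new HashMap<>();

    void register(String storeType, Function<byte[], Object> decoder) {
        decoders.put(storeType, decoder);
    }

    Object decode(FragmentEnvelope envelope) {
        Function<byte[], Object> decoder = decoders.get(envelope.storeType);
        if (decoder == null) {
            throw new IllegalArgumentException(
                "No decoder registered for store type: " + envelope.storeType);
        }
        return decoder.apply(envelope.contents);
    }
}
```

The point of this design is that adding a new storage type would only need a 
new decoder registration, without changing the envelope definition itself.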

> Use Fragment instead of FileFragment.
> -------------------------------------
>
>                 Key: TAJO-1123
>                 URL: https://issues.apache.org/jira/browse/TAJO-1123
>             Project: Tajo
>          Issue Type: Sub-task
>            Reporter: Hyoungjun Kim
>            Assignee: Hyoungjun Kim
>            Priority: Minor
>
> Currently most operators and the planner use the FileFragment object for 
> splitting data. FileFragment only has information about a scanning target 
> file. In order to support various storage types, this should be changed to 
> the abstract object 'Fragment'.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
