Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

2019-08-14 Thread Ji Liu
My original thoughts is that introduce a new interface makes the hierarchy a 
little confused (FixedListVector->BaseListVector, 
ListVector->BaseRepeatedVector->BaseListVector) and should try to avoid 
introducing new classes.

And you are right, FixedSizeListVector should not include offsetBuffer, I am 
fine with the approach you suggested, BaseRepeatedVector/ListVector and 
FixedSizeListVector both inherit from new interface(BaseListVector) they can be 
used interchangeably in dictionary encoding.Let see if others have different 
opinions.


Thanks,
Ji Liu


--
From:Micah Kornfield 
Send Time:2019年8月14日(星期三) 15:10
To:Ji Liu 
Cc:dev 
Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

You are right, the mainly difference between FixSizedListVector and ListVector 
is the offsetBuffer, but I think this could be avoided through 
allocateNewSafe() overwrite which calls allocateOffsetBuffer() in 
BaseRepeatedValueVector, in this way, offsetBuffer in FixSizedListVector will 
remain allocator.getEmpty(). 

I think there other methods that FixedSizeList shouldn't be implementing that 
are on List as well.   In an ideal world, I think the parent class/interface 
would be called ListVector and there would then be specific children of 
FixedSizeList and VariableSizeList.  I think that is too big a change to 
something core, but we should try to keep the relationship in that shape, so we 
don't need to override methods just to throw NotSupportedExceptions.
On Sun, Aug 11, 2019 at 7:35 AM Ji Liu  wrote:

Thanks Jacques, to avoid complex call paths for getObject, should keep 
getObject for both classes. I'll also checked for other methods.

Thanks,
Ji Liu

--
From:Jacques Nadeau 
Send Time:2019年8月11日(星期日) 21:43
To:dev ; Ji Liu 
Cc:emkornfield 
Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

We tried to get away from this kind of back and forth with subclassing as
much as possible. (call getObject on base class which then calls getIndex
on child class which then calls something else on base class). I haven't
looked through the code but let's try to avoid having complex call paths
for the vectors.

On Sat, Aug 10, 2019 at 6:07 PM Ji Liu  wrote:

> Hi Micah, thanks for your suggestion.
> You are right, the mainly difference between FixSizedListVector and
> ListVector is the offsetBuffer, but I think this could be avoided through
> allocateNewSafe() overwrite which calls allocateOffsetBuffer() in
> BaseRepeatedValueVector, in this way, offsetBuffer in FixSizedListVector
> will remain allocator.getEmpty().
>
> Meanwhile, we could add getStartIndex(int index)/getEndIndex(int index)
> API to handle read data logic respectively which could be used in
> getObject(int index) or encoding parts. What’s more, no new interface need
> to be introduced.
>
> What do you think?
>
>
> Thanks,
> Ji Liu
>
>
> --
> From:Micah Kornfield 
> Send Time:2019年8月11日(星期日) 08:47
> To:dev ; Ji Liu 
> Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from
> ListVector
>
> Hi Ji Liu,
> I think have a common interface/base-class for the two makes sense (but
> don't have historical context) from a reading data perspective.
>
> I think the change would need to be something above
> BaseRepeatedValueVector, since the FixedSizeListVector doesn't contain an
> offset buffer, and that field is contained on BaseRepeatedValueVector.
>
> Thanks,
> Micah
> On Sat, Aug 10, 2019 at 5:25 PM Ji Liu  wrote:
> Hi, all
>
>  While working on the issue to implement dictionary-encoded subfields[1]
> [2], I found FixedSizeListVector not extends ListVector(Thanks Micah
> pointing this out and curious why implemented FixedSizeListVector this way
>  before). Since FixedSizeListVector is a specific case of ListVector,
> should we make former extends the latter to reduce the plenty duplicated
> logic in these two and writer/reader classes?
>
>
>  Thanks,
>  Ji Liu
>
>  [1]
> https://issues.apache.org/jira/browse/ARROW-1175[2]https://github.com/apache/arrow/pull/4972
>
>


Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

2019-08-14 Thread Micah Kornfield
>
> You are right, the mainly difference between FixSizedListVector and
> ListVector is the offsetBuffer, but I think this could be avoided through
> allocateNewSafe() overwrite which calls allocateOffsetBuffer() in
> BaseRepeatedValueVector, in this way, offsetBuffer in FixSizedListVector
> will remain allocator.getEmpty().


I think there other methods that FixedSizeList shouldn't be implementing
that are on List as well.   In an ideal world, I think the parent
class/interface would be called ListVector and there would then be specific
children of FixedSizeList and VariableSizeList.  I think that is too big a
change to something core, but we should try to keep the relationship in
that shape, so we don't need to override methods just to throw
NotSupportedExceptions.

On Sun, Aug 11, 2019 at 7:35 AM Ji Liu  wrote:

> Thanks Jacques, to avoid complex call paths for getObject, should keep
> getObject for both classes. I'll also checked for other methods.
>
> Thanks,
> Ji Liu
>
> --
> From:Jacques Nadeau 
> Send Time:2019年8月11日(星期日) 21:43
> To:dev ; Ji Liu 
> Cc:emkornfield 
> Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from
> ListVector
>
> We tried to get away from this kind of back and forth with subclassing as
> much as possible. (call getObject on base class which then calls getIndex
> on child class which then calls something else on base class). I haven't
> looked through the code but let's try to avoid having complex call paths
> for the vectors.
>
> On Sat, Aug 10, 2019 at 6:07 PM Ji Liu  wrote:
>
> > Hi Micah, thanks for your suggestion.
> > You are right, the mainly difference between FixSizedListVector and
> > ListVector is the offsetBuffer, but I think this could be avoided through
> > allocateNewSafe() overwrite which calls allocateOffsetBuffer() in
> > BaseRepeatedValueVector, in this way, offsetBuffer in FixSizedListVector
> > will remain allocator.getEmpty().
> >
> > Meanwhile, we could add getStartIndex(int index)/getEndIndex(int index)
> > API to handle read data logic respectively which could be used in
>
> > getObject(int index) or encoding parts. What’s more, no new interface need
> > to be introduced.
> >
> > What do you think?
> >
> >
> > Thanks,
> > Ji Liu
> >
> >
> > --
> > From:Micah Kornfield 
> > Send Time:2019年8月11日(星期日) 08:47
> > To:dev ; Ji Liu 
> > Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from
> > ListVector
> >
> > Hi Ji Liu,
> > I think have a common interface/base-class for the two makes sense (but
> > don't have historical context) from a reading data perspective.
> >
> > I think the change would need to be something above
> > BaseRepeatedValueVector, since the FixedSizeListVector doesn't contain an
> > offset buffer, and that field is contained on BaseRepeatedValueVector.
> >
> > Thanks,
> > Micah
> > On Sat, Aug 10, 2019 at 5:25 PM Ji Liu  > wrote:
> > Hi, all
> >
> >  While working on the issue to implement dictionary-encoded subfields[1]
> > [2], I found FixedSizeListVector not extends ListVector(Thanks Micah
>
> > pointing this out and curious why implemented FixedSizeListVector this way
> >  before). Since FixedSizeListVector is a specific case of ListVector,
> > should we make former extends the latter to reduce the plenty duplicated
> > logic in these two and writer/reader classes?
> >
> >
> >  Thanks,
> >  Ji Liu
> >
> >  [1]
> >
> https://issues.apache.org/jira/browse/ARROW-1175[2]https://github.com/apache/arrow/pull/4972
> >
> >
>
>


Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

2019-08-11 Thread Ji Liu
Thanks Jacques, to avoid complex call paths for getObject, should keep 
getObject for both classes. I'll also checked for other methods.

Thanks,
Ji Liu


--
From:Jacques Nadeau 
Send Time:2019年8月11日(星期日) 21:43
To:dev ; Ji Liu 
Cc:emkornfield 
Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

We tried to get away from this kind of back and forth with subclassing as
much as possible. (call getObject on base class which then calls getIndex
on child class which then calls something else on base class). I haven't
looked through the code but let's try to avoid having complex call paths
for the vectors.

On Sat, Aug 10, 2019 at 6:07 PM Ji Liu  wrote:

> Hi Micah, thanks for your suggestion.
> You are right, the mainly difference between FixSizedListVector and
> ListVector is the offsetBuffer, but I think this could be avoided through
> allocateNewSafe() overwrite which calls allocateOffsetBuffer() in
> BaseRepeatedValueVector, in this way, offsetBuffer in FixSizedListVector
> will remain allocator.getEmpty().
>
> Meanwhile, we could add getStartIndex(int index)/getEndIndex(int index)
> API to handle read data logic respectively which could be used in
> getObject(int index) or encoding parts. What’s more, no new interface need
> to be introduced.
>
> What do you think?
>
>
> Thanks,
> Ji Liu
>
>
> --
> From:Micah Kornfield 
> Send Time:2019年8月11日(星期日) 08:47
> To:dev ; Ji Liu 
> Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from
> ListVector
>
> Hi Ji Liu,
> I think have a common interface/base-class for the two makes sense (but
> don't have historical context) from a reading data perspective.
>
> I think the change would need to be something above
> BaseRepeatedValueVector, since the FixedSizeListVector doesn't contain an
> offset buffer, and that field is contained on BaseRepeatedValueVector.
>
> Thanks,
> Micah
> On Sat, Aug 10, 2019 at 5:25 PM Ji Liu  wrote:
> Hi, all
>
>  While working on the issue to implement dictionary-encoded subfields[1]
> [2], I found FixedSizeListVector not extends ListVector(Thanks Micah
> pointing this out and curious why implemented FixedSizeListVector this way
>  before). Since FixedSizeListVector is a specific case of ListVector,
> should we make former extends the latter to reduce the plenty duplicated
> logic in these two and writer/reader classes?
>
>
>  Thanks,
>  Ji Liu
>
>  [1]
> https://issues.apache.org/jira/browse/ARROW-1175[2]https://github.com/apache/arrow/pull/4972
>
>


Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

2019-08-11 Thread Jacques Nadeau
We tried to get away from this kind of back and forth with subclassing as
much as possible. (call getObject on base class which then calls getIndex
on child class which then calls something else on base class). I haven't
looked through the code but let's try to avoid having complex call paths
for the vectors.

On Sat, Aug 10, 2019 at 6:07 PM Ji Liu  wrote:

> Hi Micah, thanks for your suggestion.
> You are right, the mainly difference between FixSizedListVector and
> ListVector is the offsetBuffer, but I think this could be avoided through
> allocateNewSafe() overwrite which calls allocateOffsetBuffer() in
> BaseRepeatedValueVector, in this way, offsetBuffer in FixSizedListVector
> will remain allocator.getEmpty().
>
> Meanwhile, we could add getStartIndex(int index)/getEndIndex(int index)
> API to handle read data logic respectively which could be used in
> getObject(int index) or encoding parts. What’s more, no new interface need
> to be introduced.
>
> What do you think?
>
>
> Thanks,
> Ji Liu
>
>
> --
> From:Micah Kornfield 
> Send Time:2019年8月11日(星期日) 08:47
> To:dev ; Ji Liu 
> Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from
> ListVector
>
> Hi Ji Liu,
> I think have a common interface/base-class for the two makes sense (but
> don't have historical context) from a reading data perspective.
>
> I think the change would need to be something above
> BaseRepeatedValueVector, since the FixedSizeListVector doesn't contain an
> offset buffer, and that field is contained on BaseRepeatedValueVector.
>
> Thanks,
> Micah
> On Sat, Aug 10, 2019 at 5:25 PM Ji Liu  wrote:
> Hi, all
>
>  While working on the issue to implement dictionary-encoded subfields[1]
> [2], I found FixedSizeListVector not extends ListVector(Thanks Micah
> pointing this out and curious why implemented FixedSizeListVector this way
>  before). Since FixedSizeListVector is a specific case of ListVector,
> should we make former extends the latter to reduce the plenty duplicated
> logic in these two and writer/reader classes?
>
>
>  Thanks,
>  Ji Liu
>
>  [1]
> https://issues.apache.org/jira/browse/ARROW-1175[2]https://github.com/apache/arrow/pull/4972
>
>


Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

2019-08-10 Thread Ji Liu
Hi Micah, thanks for your suggestion. 
You are right, the mainly difference between FixSizedListVector and ListVector 
is the offsetBuffer, but I think this could be avoided through 
allocateNewSafe() overwrite which calls allocateOffsetBuffer() in 
BaseRepeatedValueVector, in this way, offsetBuffer in FixSizedListVector will 
remain allocator.getEmpty(). 

Meanwhile, we could add getStartIndex(int index)/getEndIndex(int index) API to 
handle read data logic respectively which could be used in getObject(int index) 
or encoding parts. What’s more, no new interface need to be introduced.

What do you think?


Thanks,
Ji Liu


--
From:Micah Kornfield 
Send Time:2019年8月11日(星期日) 08:47
To:dev ; Ji Liu 
Subject:Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

Hi Ji Liu,
I think have a common interface/base-class for the two makes sense (but don't 
have historical context) from a reading data perspective. 

I think the change would need to be something above BaseRepeatedValueVector, 
since the FixedSizeListVector doesn't contain an offset buffer, and that field 
is contained on BaseRepeatedValueVector.

Thanks,
Micah
On Sat, Aug 10, 2019 at 5:25 PM Ji Liu  wrote:
Hi, all

 While working on the issue to implement dictionary-encoded subfields[1] [2], I 
found FixedSizeListVector not extends ListVector(Thanks Micah pointing this out 
and curious why implemented FixedSizeListVector this way 
 before). Since FixedSizeListVector is a specific case of ListVector, should we 
make former extends the latter to reduce the plenty duplicated logic in these 
two and writer/reader classes? 


 Thanks,
 Ji Liu

 [1] 
https://issues.apache.org/jira/browse/ARROW-1175[2]https://github.com/apache/arrow/pull/4972



Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

2019-08-10 Thread Micah Kornfield
Hi Ji Liu,
I think have a common interface/base-class for the two makes sense (but
don't have historical context) from a reading data perspective.

I think the change would need to be something above
BaseRepeatedValueVector, since the FixedSizeListVector doesn't contain an
offset buffer, and that field is contained on BaseRepeatedValueVector.

Thanks,
Micah

On Sat, Aug 10, 2019 at 5:25 PM Ji Liu  wrote:

> Hi, all
>
> While working on the issue to implement dictionary-encoded subfields[1]
> [2], I found FixedSizeListVector not extends ListVector(Thanks Micah
> pointing this out and curious why implemented FixedSizeListVector this way
> before). Since FixedSizeListVector is a specific case of ListVector,
> should we make former extends the latter to reduce the plenty duplicated
> logic in these two and writer/reader classes?
>
>
> Thanks,
> Ji Liu
>
> [1]
> https://issues.apache.org/jira/browse/ARROW-1175[2]https://github.com/apache/arrow/pull/4972
>
>