[ https://issues.apache.org/jira/browse/ORC-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley updated ORC-541:
------------------------------
Fix Version/s: (was: 1.5.7)
> Extend CHAR behavior to STRING
> ------------------------------
>
> Key: ORC-541
> URL: https://issues.apache.org/jira/browse/ORC-541
> Project: ORC
> Issue Type: Improvement
> Components: C++
> Affects Versions: 1.5.6
> Reporter: Jerry Adair
> Priority: Minor
>
> This issue is a dual-purpose animal of sorts; I'd like to offer a suggestion
> and a contribution to satisfy that suggestion, as well as to ask a question.
> The question is why the ORC types CHAR and VARCHAR are processed
> differently from STRING. I'm guessing that there was a reason, but I'm not
> certain what that reason might be.
>
> The specific area that I am addressing is in regard to the maxLength
> attribute of the TypeImpl class. With CHAR and VARCHAR, a user can define
> this maxLength attribute, but with STRING they cannot. Granted, there is a
> "convenience method", if you will, for only the CHAR class, thus:
>     ORC_UNIQUE_PTR<Type> createCharType(TypeKind kind,
>                                         uint64_t maxLength);
> In my small test program, I used this like so:
>     container->addStructField( std::string( "char column" ),
>         createCharType( orc::TypeKind::CHAR, 20 ) );
>
> So at a minimum it would seem that there should be an equivalent for the
> VARCHAR type. However, I was able to "get crafty" and create one via the
> following:
>     container->addStructField( std::string( "varchar column" ),
>         std::unique_ptr<Type>(new TypeImpl(orc::TypeKind::VARCHAR, 20)));
>
> These produce types of char(20) and varchar(20), respectively, and the
> getMaximumLength() method returns a value of 20 in both cases.
>
> However, none of this works for the STRING type. As with VARCHAR, there is
> no "convenience method", and an attempt similar to the VARCHAR one shown
> above, thus:
>     container->addStructField( std::string( "string column" ),
>         std::unique_ptr<Type>(new TypeImpl(orc::TypeKind::STRING, 20)));
> failed to produce the result I would have expected. It was easy to see why
> the output type was just "string"; that is readily seen in the toString()
> method. However, I was a bit surprised to see that getMaximumLength()
> returned 0 when I used the second variant of the TypeImpl constructor, i.e.
> the one that sets maxLength via the second parameter.
>
> Unfortunately I didn't have time to dig into why that was happening, but I'd
> seen enough to warrant an issue report, albeit not of critical importance.
>
> All that said, as a user of ORC, I'd like to see the STRING type handled in
> the same manner as the CHAR and VARCHAR types, with convenience methods for
> both VARCHAR and STRING, as there is for CHAR. Or at least I'd like to learn
> why there is only the one convenience method and why STRING is treated so
> differently. We could use this functionality in our project (in which we
> use ORC), and this is the reason I am opening the issue ticket in the first
> place.
>
> I'd be willing to contribute the fix, as it seems easy enough to do. But
> I'll leave that up to Owen or other project folk to decide.
>
> Thanks,
> Jerry
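
For readers following the thread, the requested behavior can be modeled in a small stand-alone sketch. This is not ORC's TypeImpl and does not use the ORC headers; SketchKind, SketchType, makeCharType, makeVarcharType, and makeStringType are hypothetical names introduced only for illustration. The sketch assumes that a STRING type would keep the "string" name from toString() while still reporting the maxLength it was constructed with.

```cpp
#include <cstdint>
#include <string>

// Stand-alone model for illustration only. Every name here is hypothetical
// and is NOT part of the ORC C++ API.
enum class SketchKind { STRING, VARCHAR, CHAR };

class SketchType {
 public:
  SketchType(SketchKind kind, std::uint64_t maxLength)
      : kind_(kind), maxLength_(maxLength) {}

  SketchKind getKind() const { return kind_; }

  // Reported for every kind, including STRING (the behavior being requested).
  std::uint64_t getMaximumLength() const { return maxLength_; }

  // Uses the type names mentioned in the report: char(N), varchar(N), string.
  std::string toString() const {
    switch (kind_) {
      case SketchKind::CHAR:
        return "char(" + std::to_string(maxLength_) + ")";
      case SketchKind::VARCHAR:
        return "varchar(" + std::to_string(maxLength_) + ")";
      case SketchKind::STRING:
        return "string";
    }
    return "unknown";
  }

 private:
  SketchKind kind_;
  std::uint64_t maxLength_;
};

// Hypothetical convenience factories, one per kind.
inline SketchType makeCharType(std::uint64_t maxLength) {
  return SketchType(SketchKind::CHAR, maxLength);
}

inline SketchType makeVarcharType(std::uint64_t maxLength) {
  return SketchType(SketchKind::VARCHAR, maxLength);
}

inline SketchType makeStringType(std::uint64_t maxLength) {
  return SketchType(SketchKind::STRING, maxLength);
}
```

In this model, makeStringType(20).toString() yields "string" and makeStringType(20).getMaximumLength() yields 20. Whether the real library should behave this way is the open question in the ticket.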