[ 
https://issues.apache.org/jira/browse/ARROW-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291844#comment-17291844
 ] 

Neal Richardson commented on ARROW-11735:
-----------------------------------------

See r/data-raw/codegen.R for how this comes together (docs in comments at the 
top of the file), and look at the bottom of r/src/filesystem.cpp for an example 
of how this works for S3. Changes to make:

* Make sure that everything that touches the parquet C++ namespace (C++ 
classes that are {{parquet::something}}, {{#include}} statements for parquet 
headers) is inside {{#if defined(ARROW_R_WITH_PARQUET)}}
* Change the {{// [[arrow::export]]}} annotations on those functions to be {{// 
[[parquet::export]]}}
* Add "parquet" to the features list at the top of data-raw/codegen.R
* This is the tricky part: in configure and configure.win, do a check similar 
to how we look for whether {{ARROW_S3}} was enabled in the C++ build, and if so 
add {{-DARROW_R_WITH_PARQUET}} to PKG_CFLAGS. This will be trickier on Windows 
because I don't think the rwinlib bundle includes ArrowOptions.cmake (but it 
will always have Parquet enabled). An alternative, perhaps more robust, would 
be to follow the TEST_CMD approach and test for parquet/s3 headers (current 
L179 of configure).
* In configure.win, you may also need to massage the PKG_LIBS since it has 
hard-coded that parquet and thrift are present (as they are in the rwinlib 
bundles). This is perhaps a separate issue (this issue is particularly about 
Solaris, whereas that one is about doing different C++ dev builds on 
Windows).
* Once you've done this, the build should succeed without Parquet, but tests 
that involve Parquet will error. Revise skip_if_not_available() in 
r/tests/testthat/helper-skip.R to check {{arrow_with_parquet()}} if feature == 
"parquet" (this function is generated by codegen.R), then add 
{{skip_if_not_available("parquet")}} to all relevant tests.
* Search for {{arrow_with_s3()}} in the code and see where else we should do 
the same with {{arrow_with_parquet()}}. Among the uses:
** Add parquet to {{arrow_info()}} alongside where {{arrow_with_s3()}} is 
checked
** Wrap any parquet doc examples in {{if (arrow_with_parquet())}}
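
A minimal sketch of the guard pattern described in the first bullet, to show 
how the same source can compile with or without the feature. It is not the 
actual arrow R package code: {{demo_read_parquet}}, {{demo_with_parquet}} and 
{{describe_build}} are hypothetical names used only for illustration, and the 
real {{arrow_with_parquet()}} is generated by codegen.R rather than written by 
hand. The sketch assumes configure adds {{-DARROW_R_WITH_PARQUET}} to 
PKG_CFLAGS when Parquet is available.

```cpp
#include <string>

// Assumption: configure passes -DARROW_R_WITH_PARQUET when the C++ build
// has Parquet. Nothing in this file defines the macro itself.

#if defined(ARROW_R_WITH_PARQUET)
// In the real package, parquet headers and anything using parquet:: types
// would live inside this guard, for example:
//   #include <parquet/arrow/reader.h>
// Hypothetical stand-in for a function annotated // [[parquet::export]].
std::string demo_read_parquet(const std::string& path) {
  return "reading " + path;
}
#endif

// Hypothetical analogue of the generated arrow_with_parquet() check:
// reports at runtime whether the guarded code was compiled in.
bool demo_with_parquet() {
#if defined(ARROW_R_WITH_PARQUET)
  return true;
#else
  return false;
#endif
}

// Callers branch on the feature check instead of referencing guarded
// symbols directly, so the build succeeds either way.
std::string describe_build() {
  return demo_with_parquet() ? "parquet: enabled" : "parquet: disabled";
}
```

Compiled without the flag, {{demo_read_parquet}} does not exist at all and 
{{demo_with_parquet()}} returns false, which is the condition the test helper 
and doc examples would check before touching Parquet.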

> [R] Allow parquet to be an optional component like S3
> -----------------------------------------------------
>
>                 Key: ARROW-11735
>                 URL: https://issues.apache.org/jira/browse/ARROW-11735
>             Project: Apache Arrow
>          Issue Type: Sub-task
>          Components: R
>            Reporter: Neal Richardson
>            Priority: Major
>             Fix For: 4.0.0
>
>
> Parquet requires thrift and it seems that thrift (at least as of version 
> 0.12) does not compile on Solaris:
> {code}
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp:
>  In member function virtual void 
> apache::thrift::transport::THttpServer::parseHeader(char*):
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp:50:74:
>  error: strcasestr was not declared in this scope
>    #define THRIFT_strcasestr(haystack, needle) strcasestr(haystack, needle)
>                                                                           ^
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp:62:9:
>  note: in expansion of macro THRIFT_strcasestr
>      if (THRIFT_strcasestr(value, "chunked") != NULL) {
> {code}
> (along with some boost endian header deprecation warnings)
> We could debug/patch that, or we could also make Parquet an optional feature 
> in the R bindings. That might have some value anyway so that one could build 
> a lighter/minimal R package, if that were helpful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
