[jira] [Created] (ARROW-12430) [C++] Support LZO compression
Haowei Yu created ARROW-12430:

Summary: [C++] Support LZO compression
Key: ARROW-12430
URL: https://issues.apache.org/jira/browse/ARROW-12430
Project: Apache Arrow
Issue Type: New Feature
Components: C++
Reporter: Haowei Yu

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12429) [C++] MergedGeneratorTestFixture is incorrectly instantiated
David Li created ARROW-12429:

Summary: [C++] MergedGeneratorTestFixture is incorrectly instantiated
Key: ARROW-12429
URL: https://issues.apache.org/jira/browse/ARROW-12429
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: David Li
Assignee: David Li

[https://gist.github.com/kou/868eaed328b348e45865747044044272#file-source-cpp-txt]

Looks like the base class was accidentally instantiated instead of the actual test.
[jira] [Created] (ARROW-12428) [Python] pyarrow.parquet.read_* should use pre_buffer=True
David Li created ARROW-12428:

Summary: [Python] pyarrow.parquet.read_* should use pre_buffer=True
Key: ARROW-12428
URL: https://issues.apache.org/jira/browse/ARROW-12428
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: David Li
Assignee: David Li
Fix For: 5.0.0

If the user is synchronously reading a single file, we should try to read it as fast as possible. The one sticking point might be whether it's beneficial to enable this no matter the filesystem, or whether we should try to enable it only on high-latency filesystems.
[jira] [Created] (ARROW-12427) [Rust][DataFusion] Reenable physical_optimizer::repartition::Repartition;
Andrew Lamb created ARROW-12427:

Summary: [Rust][DataFusion] Reenable physical_optimizer::repartition::Repartition;
Key: ARROW-12427
URL: https://issues.apache.org/jira/browse/ARROW-12427
Project: Apache Arrow
Issue Type: Improvement
Reporter: Andrew Lamb

To fix https://issues.apache.org/jira/browse/ARROW-12421, we disabled the physical_optimizer::repartition::Repartition rule in https://github.com/apache/arrow/pull/10069.

This ticket tracks finding the root cause of the CI test failure and re-enabling physical_optimizer::repartition::Repartition.
[jira] [Created] (ARROW-12426) [Rust] Concatenating dictionaries ignores values
Raphael Taylor-Davies created ARROW-12426:

Summary: [Rust] Concatenating dictionaries ignores values
Key: ARROW-12426
URL: https://issues.apache.org/jira/browse/ARROW-12426
Project: Apache Arrow
Issue Type: Improvement
Reporter: Raphael Taylor-Davies
Assignee: Raphael Taylor-Davies

Concatenating dictionaries ignores the values array, at best leading to incorrect data, but often leading to keys with indexes beyond the bounds of the values array.
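For readers unfamiliar with the failure mode described in ARROW-12426, the following is a minimal pure-Python sketch of what a values-aware concatenation of dictionary-encoded arrays could look like. It is not the Arrow Rust implementation or the eventual fix; the `(keys, values)` tuple representation and the names `concat_dictionaries` and `decode` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical stand-in for a dictionary-encoded array: (keys, values).
# A key of None represents a null slot.
DictArray = Tuple[List[Optional[int]], List[str]]


def concat_dictionaries(arrays: Sequence[DictArray]) -> DictArray:
    """Concatenate dictionary arrays, merging values and remapping keys."""
    merged_values: List[str] = []
    position: Dict[str, int] = {}  # value -> index in merged_values
    merged_keys: List[Optional[int]] = []
    for keys, values in arrays:
        # Map each index of this array's dictionary to the merged dictionary.
        remap: List[int] = []
        for value in values:
            if value not in position:
                position[value] = len(merged_values)
                merged_values.append(value)
            remap.append(position[value])
        # Rewrite the keys so they point into the merged dictionary.
        for key in keys:
            merged_keys.append(None if key is None else remap[key])
    return merged_keys, merged_values


def decode(array: DictArray) -> List[Optional[str]]:
    """Expand a dictionary array into its logical values."""
    keys, values = array
    return [None if key is None else values[key] for key in keys]


a = ([0, 1, 0], ["x", "y"])
b = ([0, None, 1], ["y", "z"])
merged = concat_dictionaries([a, b])
print(decode(merged))
```

Concatenating only the keys of `a` and `b` while keeping the values of `a` would decode the second half using the wrong dictionary, which is the kind of incorrect data the report describes. If the second dictionary were larger than the first, the copied keys could also point past the end of the values array.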
[jira] [Created] (ARROW-12425) [Rust] new_null_array doesn't allocate keys buffer for dictionary arrays
Raphael Taylor-Davies created ARROW-12425:

Summary: [Rust] new_null_array doesn't allocate keys buffer for dictionary arrays
Key: ARROW-12425
URL: https://issues.apache.org/jira/browse/ARROW-12425
Project: Apache Arrow
Issue Type: Improvement
Reporter: Raphael Taylor-Davies
Assignee: Raphael Taylor-Davies
[jira] [Created] (ARROW-12424) Add Schema Package
Matt Topol created ARROW-12424:

Summary: Add Schema Package
Key: ARROW-12424
URL: https://issues.apache.org/jira/browse/ARROW-12424
Project: Apache Arrow
Issue Type: Sub-task
Components: Go, Parquet
Reporter: Matt Topol
Assignee: Matt Topol

Adding the ported code for the Schema module of the Go Parquet library.
[jira] [Created] (ARROW-12423) Codecov badge in main Readme only applies to Rust
Dominik Moritz created ARROW-12423:

Summary: Codecov badge in main Readme only applies to Rust
Key: ARROW-12423
URL: https://issues.apache.org/jira/browse/ARROW-12423
Project: Apache Arrow
Issue Type: Task
Reporter: Dominik Moritz

The badge in https://github.com/apache/arrow/blob/master/README.md links to https://app.codecov.io/gh/apache/arrow, which seems to only show the coverage for the Rust code.
[jira] [Created] (ARROW-12422) Add castVARCHAR for milliseconds
Rodrigo Jacomozzi de Bem created ARROW-12422:

Summary: Add castVARCHAR for milliseconds
Key: ARROW-12422
URL: https://issues.apache.org/jira/browse/ARROW-12422
Project: Apache Arrow
Issue Type: New Feature
Reporter: Rodrigo Jacomozzi de Bem
[jira] [Created] (ARROW-12421) [Rust] [DataFusion] topk_query test fails in master
Andy Grove created ARROW-12421:

Summary: [Rust] [DataFusion] topk_query test fails in master
Key: ARROW-12421
URL: https://issues.apache.org/jira/browse/ARROW-12421
Project: Apache Arrow
Issue Type: Bug
Components: Rust - DataFusion
Reporter: Andy Grove

{code:java}
Running target/debug/deps/user_defined_plan-6b63acb904117235

running 3 tests
test topk_plan ... ok
test topk_query ... FAILED
test normal_query ... ok

failures:

---- topk_query stdout ----
thread 'topk_query' panicked at 'assertion failed: `(left == right)`
  left: `["+-------------+---------+", "| customer_id | revenue |", "+-------------+---------+", "| paul        | 300     |", "| jorge       | 200     |", "| andy        | 150     |", "+-------------+---------+"]`,
 right: `["++", "||", "++", "++"]`: output mismatch for Topk context. Expectedn
+-------------+---------+
| customer_id | revenue |
+-------------+---------+
| paul        | 300     |
| jorge       | 200     |
| andy        | 150     |
+-------------+---------+
Actual:
++
||
++
++
', datafusion/tests/user_defined_plan.rs:133:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
{code}
[jira] [Created] (ARROW-12420) [C++/Dataset] Reading null columns as dictionary not longer possible
Uwe Korn created ARROW-12420:

Summary: [C++/Dataset] Reading null columns as dictionary not longer possible
Key: ARROW-12420
URL: https://issues.apache.org/jira/browse/ARROW-12420
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Affects Versions: 4.0.0
Reporter: Uwe Korn
Fix For: 4.0.0

Reading a dataset with a dictionary column where some of the files don't contain any data for that column (and thus are typed as null) broke with https://github.com/apache/arrow/pull/9532. It worked with the 3.0 release, so I would consider this a regression.

This can be reproduced using the following Python snippet:

{code}
import pyarrow as pa
import pyarrow.parquet as pq
import pyarrow.dataset as ds

# Write a file whose only column contains nothing but nulls,
# so the column is typed as null in the file.
table = pa.table({"a": [None, None]})
pq.write_table(table, "test.parquet")

# Read it back with a schema that declares the column as a dictionary.
schema = pa.schema([pa.field("a", pa.dictionary(pa.int32(), pa.string()))])
fsds = ds.FileSystemDataset.from_paths(
    paths=["test.parquet"],
    schema=schema,
    format=pa.dataset.ParquetFileFormat(),
    filesystem=pa.fs.LocalFileSystem(),
)
fsds.to_table()
{code}

The exception on master is currently:

{code}
---------------------------------------------------------------------------
ArrowNotImplementedError                  Traceback (most recent call last)
 in 
      6     filesystem=pa.fs.LocalFileSystem(),
      7 )
----> 8 fsds.to_table()

~/Development/arrow/python/pyarrow/_dataset.pyx in pyarrow._dataset.Dataset.to_table()
    456         table : Table instance
    457         """
--> 458         return self._scanner(**kwargs).to_table()
    459 
    460     def head(self, int num_rows, **kwargs):

~/Development/arrow/python/pyarrow/_dataset.pyx in pyarrow._dataset.Scanner.to_table()
   2887         result = self.scanner.ToTable()
   2888 
-> 2889         return pyarrow_wrap_table(GetResultValue(result))
   2890 
   2891     def take(self, object indices):

~/Development/arrow/python/pyarrow/error.pxi in pyarrow.lib.pyarrow_internal_check_status()
    139 cdef api int pyarrow_internal_check_status(const CStatus& status) \
    140         nogil except -1:
--> 141     return check_status(status)
    142 
    143 

~/Development/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
    116             raise ArrowKeyError(message)
    117         elif status.IsNotImplemented():
--> 118             raise ArrowNotImplementedError(message)
    119         elif status.IsTypeError():
    120             raise ArrowTypeError(message)

ArrowNotImplementedError: Unsupported cast from null to dictionary (no available cast function for target type)
{code}
[jira] [Created] (ARROW-12419) [Java] flatc is not used in mvn
Kazuaki Ishizaki created ARROW-12419:

Summary: [Java] flatc is not used in mvn
Key: ARROW-12419
URL: https://issues.apache.org/jira/browse/ARROW-12419
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 4.0.0
Reporter: Kazuaki Ishizaki
Assignee: Kazuaki Ishizaki

ARROW-12111 removed the usage of flatc during the build process in mvn. Thus, it is not necessary to explicitly download flatc for s390x.