[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-23 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r413987657



##
File path: rust/arrow/src/array/builder.rs
##
@@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {
         self.write_bytes(v.to_byte_slice(), 1)
     }
 
+    default fn append_n(&mut self, n: usize, v: T::Native) -> Result<()> {
+        self.reserve(n)?;
+        for _ in 0..n {
+            self.write_bytes(v.to_byte_slice(), 1)?;
+        }

Review comment:
   When unoptimised, both forms contain a `checked_add`; it is just hard to 
see in the iterator form because the foreach+closure isn't inlined, and the 
codegen for the iterator form is significantly worse in general. When 
optimised, the codegen is identical for both forms.
   
   Given this, I'm inclined to optimise for readability and use the simpler 
form, but ultimately it is up to you as the maintainers :)
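
To make the two forms concrete, here is a minimal sketch on a hypothetical `ToyBuilder` (not Arrow's `BufferBuilder`). The thread does not show the exact iterator code that was proposed, so `try_for_each` is an assumption here, as are `write_one`, the `max_bytes` limit and the `String` error type. Both methods append `v` up to `n` times and stop at the first error.

```rust
/// Hypothetical stand-in for a buffer builder; not Arrow's actual type.
pub struct ToyBuilder {
    data: Vec<u8>,
    len: usize,       // number of elements written so far
    max_bytes: usize, // artificial limit so that writes can fail
}

impl ToyBuilder {
    pub fn new(max_bytes: usize) -> Self {
        ToyBuilder { data: Vec::new(), len: 0, max_bytes }
    }

    /// Appends a single i32, failing once the byte limit would be exceeded.
    fn write_one(&mut self, v: i32) -> Result<(), String> {
        let bytes = v.to_le_bytes();
        if self.data.len() + bytes.len() > self.max_bytes {
            return Err("capacity exceeded".to_string());
        }
        self.data.extend_from_slice(&bytes);
        self.len += 1;
        Ok(())
    }

    /// Plain loop form: `?` propagates the first error directly.
    pub fn append_n_loop(&mut self, n: usize, v: i32) -> Result<(), String> {
        for _ in 0..n {
            self.write_one(v)?;
        }
        Ok(())
    }

    /// Iterator form: `try_for_each` short-circuits on the first `Err`.
    pub fn append_n_iter(&mut self, n: usize, v: i32) -> Result<(), String> {
        (0..n).try_for_each(|_| self.write_one(v))
    }

    pub fn len(&self) -> usize {
        self.len
    }
}
```

Both methods have the same observable behaviour; whether they also compile to the same machine code depends on the optimisation level, as the comment above notes.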





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-23 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r413969298



##
File path: rust/arrow/src/array/builder.rs
##
@@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {
         self.write_bytes(v.to_byte_slice(), 1)
     }
 
+    default fn append_n(&mut self, n: usize, v: T::Native) -> Result<()> {
+        self.reserve(n)?;
+        for _ in 0..n {
+            self.write_bytes(v.to_byte_slice(), 1)?;
+        }

Review comment:
   Thanks, I was not aware of this particular compiler bug. In this 
particular case the codegen at opt-level 2 or higher is identical. I think the 
problem as described [here](https://github.com/rust-lang/rust/issues/35981) 
only manifests on nested for loops.
   
   I can update if necessary, but error propagation out of the closure is pretty 
clunky...
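
As an illustration of why propagating errors out of a closure is awkward, here is a minimal sketch. `write_checked` and `fill_for_each` are hypothetical helpers, not part of Arrow. The closure passed to `for_each` must return `()`, so `?` cannot be used inside it and the first error has to be carried out through a captured variable.

```rust
/// Hypothetical fallible write: refuses to grow `out` beyond `limit` elements.
fn write_checked(out: &mut Vec<i32>, v: i32, limit: usize) -> Result<(), String> {
    if out.len() >= limit {
        return Err(format!("limit of {} elements reached", limit));
    }
    out.push(v);
    Ok(())
}

/// Appends `v` up to `n` times using `for_each`.
pub fn fill_for_each(
    out: &mut Vec<i32>,
    n: usize,
    v: i32,
    limit: usize,
) -> Result<(), String> {
    // The error has to live outside the closure.
    let mut result: Result<(), String> = Ok(());
    (0..n).for_each(|_| {
        // `?` is not usable here: the closure must return `()`.
        // The iteration also cannot stop early, so later writes are skipped
        // once an error has been recorded.
        if result.is_ok() {
            result = write_checked(out, v, limit);
        }
    });
    result
}
```

With a plain `for` loop the body reduces to `write_checked(out, v, limit)?;`, which is the readability argument made in the comment.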









[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-22 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r412736972



##
File path: rust/arrow/src/array/builder.rs
##
@@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {
         self.write_bytes(v.to_byte_slice(), 1)
     }
 
+    default fn append_n(&mut self, n: usize, v: T::Native) -> Result<()> {
+        self.reserve(n)?;
+        for _ in 0..n {
+            self.write_bytes(v.to_byte_slice(), 1)?;
+        }

Review comment:
   That would be incorrect though? The method should write the value `v` 
`n` times. Calling `write_bytes(v.to_byte_slice(), n)` would write `v` once and 
then increase the recorded element count by n, which would leave things in an 
odd state. 
   
   For the general case there is little reason for this method to exist, aside 
from reserving memory once. It mainly exists because the bitmap specialization 
can more efficiently append the same value multiple times than a naive loop 
such as this.
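
The point about the bitmap specialization can be sketched as follows. This is an illustration only, not Arrow's implementation: the function names, the `Vec<u8>` bitmap and the least-significant-bit-first layout are assumptions. The naive version touches the bitmap once per bit, while the bulk version sets whole bytes once it is byte-aligned.

```rust
/// Naive form: append `n` copies of `v` one bit at a time.
pub fn append_bits_naive(bits: &mut Vec<u8>, len: &mut usize, n: usize, v: bool) {
    for _ in 0..n {
        let i = *len;
        if i / 8 == bits.len() {
            bits.push(0); // start a new zeroed byte
        }
        if v {
            bits[i / 8] |= 1 << (i % 8);
        }
        *len += 1;
    }
}

/// Bulk form: reserve once, then fill whole bytes where possible.
/// Assumes every bit at or beyond `*len` is currently zero.
pub fn append_bits_bulk(bits: &mut Vec<u8>, len: &mut usize, n: usize, v: bool) {
    let end = *len + n;
    bits.resize((end + 7) / 8, 0); // single zero-filled growth
    if v {
        let mut i = *len;
        // Leading bits, until the position is byte-aligned.
        while i < end && i % 8 != 0 {
            bits[i / 8] |= 1 << (i % 8);
            i += 1;
        }
        // Whole bytes, eight bits per store.
        while i + 8 <= end {
            bits[i / 8] = 0xFF;
            i += 8;
        }
        // Trailing bits in the last partial byte.
        while i < end {
            bits[i / 8] |= 1 << (i % 8);
            i += 1;
        }
    }
    *len = end;
}
```

For `v == false` the bulk version only has to grow the zero-filled bitmap, which is where most of the saving over the per-bit loop comes from.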









[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-22 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r412736972



##
File path: rust/arrow/src/array/builder.rs
##
@@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {
         self.write_bytes(v.to_byte_slice(), 1)
     }
 
+    default fn append_n(&mut self, n: usize, v: T::Native) -> Result<()> {
+        self.reserve(n)?;
+        for _ in 0..n {
+            self.write_bytes(v.to_byte_slice(), 1)?;
+        }

Review comment:
   That would be incorrect though? The method should write the value `v` 
`n` times. Calling `write_bytes(v.to_byte_slice(), n)` would write `v` once and 
then increase the recorded element count by n, which would leave things in an 
odd state. 
   
   For the general case there is little reason for this method to exist, aside 
from reserving memory once. It mainly exists because the bitmap specialization 
can more efficiently append the same value multiple times than a naive loop 
such as this.
   
   Edit: Checked again and you're completely correct: there is a type 
specialization for the buffer writer. Will fix.









[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-22 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r412736972



##
File path: rust/arrow/src/array/builder.rs
##
@@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {
         self.write_bytes(v.to_byte_slice(), 1)
     }
 
+    default fn append_n(&mut self, n: usize, v: T::Native) -> Result<()> {
+        self.reserve(n)?;
+        for _ in 0..n {
+            self.write_bytes(v.to_byte_slice(), 1)?;
+        }

Review comment:
   That would be incorrect though? The method should write the value `v` 
`n` times. Calling `write_bytes(v.to_byte_slice(), n)` would write `v` once and 
then increase the recorded element count by n, which would leave things in an 
odd state. 
   
   For the general case there is no reason for this method to exist; it 
exists purely because the bitmap specialization can append the same value 
multiple times more efficiently than a naive loop such as this.









[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-22 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r412736972



##
File path: rust/arrow/src/array/builder.rs
##
@@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {
         self.write_bytes(v.to_byte_slice(), 1)
     }
 
+    default fn append_n(&mut self, n: usize, v: T::Native) -> Result<()> {
+        self.reserve(n)?;
+        for _ in 0..n {
+            self.write_bytes(v.to_byte_slice(), 1)?;
+        }

Review comment:
   That would be incorrect though? The method should write the value `v` 
`n` times. Calling `write_bytes(v.to_byte_slice(), n)` would write `v` once and 
then update the recorded element count by n, which would leave things in an odd 
state.
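
The concern described here can be sketched with a hypothetical `ToyWriter`. The semantics of `write_bytes(bytes, len_added)` below (copy `bytes` once, add `len_added` to the element count) are an assumption taken from this comment's description, not a statement of what Arrow's `write_bytes` does for every type.

```rust
/// Hypothetical writer for i32 values; not Arrow's `BufferBuilder`.
pub struct ToyWriter {
    data: Vec<u8>,
    len: usize, // recorded element count
}

impl ToyWriter {
    pub fn new() -> Self {
        ToyWriter { data: Vec::new(), len: 0 }
    }

    /// Assumed semantics: copy `bytes` once, then bump the count by `len_added`.
    pub fn write_bytes(&mut self, bytes: &[u8], len_added: usize) {
        self.data.extend_from_slice(bytes);
        self.len += len_added;
    }

    /// Reserve once, then write `v` exactly `n` times.
    pub fn append_n(&mut self, n: usize, v: i32) {
        self.data.reserve(n * 4);
        for _ in 0..n {
            self.write_bytes(&v.to_le_bytes(), 1);
        }
    }

    /// True when the byte length matches the recorded element count.
    pub fn is_consistent(&self) -> bool {
        self.data.len() == self.len * 4
    }
}
```

Under these assumed semantics, `write_bytes(&v.to_le_bytes(), 3)` stores four bytes but records three elements, which is the odd state the comment refers to; `append_n(3, v)` keeps the two in step.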









[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-22 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r412736972



##
File path: rust/arrow/src/array/builder.rs
##
@@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {
         self.write_bytes(v.to_byte_slice(), 1)
     }
 
+    default fn append_n(&mut self, n: usize, v: T::Native) -> Result<()> {
+        self.reserve(n)?;
+        for _ in 0..n {
+            self.write_bytes(v.to_byte_slice(), 1)?;
+        }

Review comment:
   That would be incorrect though? The method should write the value `v` 
`n` times. Calling `write_bytes(v.to_byte_slice(), n)` would write `v` once and 
then update the array length by n, which would leave things in an odd state.









[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-20 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r411675427



##
File path: rust/arrow/src/array/builder.rs
##
@@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {
         self.write_bytes(v.to_byte_slice(), 1)
     }
 
+    default fn append_n(&mut self, n: usize, v: T::Native) -> Result<()> {
+        self.reserve(n)?;
+        for _ in 0..n {
+            self.write_bytes(v.to_byte_slice(), 1)?;
+        }

Review comment:
   I'm not sure I understand what you mean?









[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-20 Thread GitBox


tustvold commented on a change in pull request #6980:
URL: https://github.com/apache/arrow/pull/6980#discussion_r411174949



##
File path: rust/arrow/src/array/builder.rs
##
@@ -301,6 +324,21 @@ impl BufferBuilderTrait for BufferBuilder {
         Ok(())
     }
 
+    fn append_n(&mut self, n: usize, v: bool) -> Result<()> {
+        self.reserve(n)?;
+        if v {
+            unsafe {
+                bit_util::set_bits_raw(
+                    self.buffer.raw_data() as *mut u8,

Review comment:
   Changed, and fixed the others in the same file




