[ 
https://issues.apache.org/jira/browse/ARROW-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421415#comment-16421415
 ] 

ASF GitHub Bot commented on ARROW-2361:
---------------------------------------

wesm closed pull request #1804: ARROW-2361: [Rust] Starting point for a native 
Rust implementation of Arrow
URL: https://github.com/apache/arrow/pull/1804
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/rust/.gitignore b/rust/.gitignore
new file mode 100644
index 000000000..fa8d85ac5
--- /dev/null
+++ b/rust/.gitignore
@@ -0,0 +1,2 @@
+Cargo.lock
+target
diff --git a/rust/Cargo.toml b/rust/Cargo.toml
new file mode 100644
index 000000000..9b7b3102f
--- /dev/null
+++ b/rust/Cargo.toml
@@ -0,0 +1,38 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+[package]
+name = "arrow"
+version = "0.1.0"
+description = "Rust implementation of Apache Arrow"
+homepage = "https://github.com/apache/arrow"
+repository = "https://github.com/apache/arrow"
+authors = ["Apache Arrow <[email protected]>"]
+license = "Apache-2.0"
+keywords = [ "arrow" ]
+include = [
+    "src/**/*.rs",
+    "Cargo.toml",
+]
+
+[lib]
+name = "arrow"
+path = "src/lib.rs"
+
+[dependencies]
+bytes = "0.4"
+libc = "0.2"
\ No newline at end of file
diff --git a/rust/README.md b/rust/README.md
new file mode 100644
index 000000000..cc50e80cb
--- /dev/null
+++ b/rust/README.md
@@ -0,0 +1,47 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Native Rust implementation of Apache Arrow
+
+## Status
+
+This is a starting point for a native Rust implementation of Arrow.
+
+The current code demonstrates arrays of primitive types and structs.
+
+Contiguous memory buffers are used but they are not aligned at 8-byte boundaries yet.
+
+## Example
+
+```rust
+let _schema = Schema::new(vec![
+    Field::new("a", DataType::Int32, false),
+    Field::new("b", DataType::Float32, false),
+]);
+
+let a = Rc::new(Array::from(vec![1,2,3,4,5]));
+let b = Rc::new(Array::from(vec![1.1, 2.2, 3.3, 4.4, 5.5]));
+let _ = Rc::new(Array::from(vec![a,b]));
+```
+
+## Run Tests
+
+```bash
+cargo test
+```
\ No newline at end of file
diff --git a/rust/src/array.rs b/rust/src/array.rs
new file mode 100644
index 000000000..15d89c3ab
--- /dev/null
+++ b/rust/src/array.rs
@@ -0,0 +1,373 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::convert::From;
+use std::iter::Iterator;
+use std::rc::Rc;
+use std::str;
+use std::string::String;
+
+use super::bitmap::Bitmap;
+use super::buffer::Buffer;
+use super::list::List;
+use super::error::*;
+
+pub enum ArrayData {
+    Boolean(Buffer<bool>),
+    Float32(Buffer<f32>),
+    Float64(Buffer<f64>),
+    Int8(Buffer<i8>),
+    Int16(Buffer<i16>),
+    Int32(Buffer<i32>),
+    Int64(Buffer<i64>),
+    UInt8(Buffer<u8>),
+    UInt16(Buffer<u16>),
+    UInt32(Buffer<u32>),
+    UInt64(Buffer<u64>),
+    Utf8(List<u8>),
+    Struct(Vec<Rc<Array>>)
+}
+
+macro_rules! arraydata_from_primitive {
+    ($DT:ty, $AT:ident) => {
+        impl From<Vec<$DT>> for ArrayData {
+            fn from(v: Vec<$DT>) -> Self {
+                ArrayData::$AT(Buffer::from(v))
+            }
+        }
+
+    }
+}
+
+arraydata_from_primitive!(bool, Boolean);
+arraydata_from_primitive!(f32, Float32);
+arraydata_from_primitive!(f64, Float64);
+arraydata_from_primitive!(i8, Int8);
+arraydata_from_primitive!(i16, Int16);
+arraydata_from_primitive!(i32, Int32);
+arraydata_from_primitive!(i64, Int64);
+arraydata_from_primitive!(u8, UInt8);
+arraydata_from_primitive!(u16, UInt16);
+arraydata_from_primitive!(u32, UInt32);
+arraydata_from_primitive!(u64, UInt64);
+
+pub struct Array {
+    pub len: i32,
+    pub null_count: i32,
+    pub validity_bitmap: Option<Bitmap>,
+    pub data: ArrayData
+}
+
+impl Array {
+
+    /// Create a new array where there are no null values
+    pub fn new(len: usize, data: ArrayData) -> Self {
+        Array { len: len as i32, data, validity_bitmap: None, null_count: 0 }
+    }
+
+    pub fn data(&self) -> &ArrayData {
+        &self.data
+    }
+
+    pub fn len(&self) -> usize {
+        self.len as usize
+    }
+
+}
+
+/// type-safe array operations
+trait ArrayOps<T> {
+    /// Get one element from an array. Note that this is an expensive call since it
+    /// will pattern match the type of the array on every invocation. We should add
+    /// other efficient iterator and map methods so we can perform columnar operations
+    /// instead.
+    fn get(&self, i: usize) -> Result<T,Error>;
+
+    /// Compare two same-typed arrays using a boolean closure e.g. eq, gt, lt, and so on
+    fn compare(&self, other: &Array, f: &Fn(T,T) -> bool) -> Result<Vec<bool>, Error>;
+
+    /// Perform a computation on two same-typed arrays and produce a result of the same type e.g. c = a + b
+    fn compute(&self, other: &Array, f: &Fn(T,T) -> T) -> Result<Vec<T>, Error>;
+}
+
+macro_rules! array_ops {
+    ($DT:ty, $AT:ident) => {
+        impl ArrayOps<$DT> for Array {
+            fn get(&self, i: usize) -> Result<$DT,Error> {
+                match self.data() {
+                    &ArrayData::$AT(ref buf) => Ok(unsafe {*buf.data().offset(i as isize)}),
+                    _ => Err(Error::from("Request for $DT but array is not $DT"))
+                }
+            }
+            fn compare(&self, other: &Array, f: &Fn($DT,$DT) -> bool) -> Result<Vec<bool>, Error> {
+                match (&self.data, &other.data) {
+                    (&ArrayData::$AT(ref l), &ArrayData::$AT(ref r)) => {
+                        let mut b: Vec<bool> = Vec::with_capacity(self.len as usize);
+                        for i in 0..self.len as isize {
+                            let lv : $DT = unsafe { *l.data().offset(i) };
+                            let rv : $DT = unsafe { *r.data().offset(i) };
+                            b.push(f(lv,rv));
+                        }
+                        Ok(b)
+                    },
+                    _ => Err(Error::from("Cannot compare arrays of this type"))
+                }
+            }
+            fn compute(&self, other: &Array, f: &Fn($DT,$DT) -> $DT) -> Result<Vec<$DT>, Error> {
+                match (&self.data, &other.data) {
+                    (&ArrayData::$AT(ref l), &ArrayData::$AT(ref r)) => {
+                        let mut b: Vec<$DT> = Vec::with_capacity(self.len as usize);
+                        for i in 0..self.len as isize {
+                            let lv : $DT = unsafe { *l.data().offset(i) };
+                            let rv : $DT = unsafe { *r.data().offset(i) };
+                            b.push(f(lv,rv));
+                        }
+                        Ok(b)
+                    },
+                    _ => Err(Error::from("Cannot compare arrays of this type"))
+                }
+            }
+        }
+    }
+}
+
+array_ops!(bool, Boolean);
+array_ops!(f64, Float64);
+array_ops!(f32, Float32);
+array_ops!(u8, UInt8);
+array_ops!(u16, UInt16);
+array_ops!(u32, UInt32);
+array_ops!(u64, UInt64);
+array_ops!(i8, Int8);
+array_ops!(i16, Int16);
+array_ops!(i32, Int32);
+array_ops!(i64, Int64);
+
+macro_rules! array_from_primitive {
+    ($DT:ty) => {
+        impl From<Vec<$DT>> for Array {
+            fn from(v: Vec<$DT>) -> Self {
+                Array { len: v.len() as i32, null_count: 0, validity_bitmap: None, data: ArrayData::from(v) }
+            }
+        }
+    }
+
+}
+
+array_from_primitive!(bool);
+array_from_primitive!(f32);
+array_from_primitive!(f64);
+array_from_primitive!(u16);
+array_from_primitive!(u32);
+array_from_primitive!(u64);
+array_from_primitive!(i8);
+array_from_primitive!(i16);
+array_from_primitive!(i32);
+array_from_primitive!(i64);
+
+macro_rules! array_from_optional_primitive {
+    ($DT:ty, $DEFAULT:expr) => {
+        impl From<Vec<Option<$DT>>> for Array {
+            fn from(v: Vec<Option<$DT>>) -> Self {
+                let mut null_count = 0;
+                let mut validity_bitmap = Bitmap::new(v.len());
+                for i in 0 .. v.len() {
+                    if v[i].is_none() {
+                        null_count+=1;
+                        validity_bitmap.clear(i);
+                    }
+                }
+                let values = v.iter().map(|x| x.unwrap_or($DEFAULT)).collect::<Vec<$DT>>();
+                Array { len: values.len() as i32, null_count, validity_bitmap: Some(validity_bitmap), data: ArrayData::from(values) }
+            }
+        }
+    }
+
+}
+
+array_from_optional_primitive!(bool, false);
+array_from_optional_primitive!(f32, 0_f32);
+array_from_optional_primitive!(f64, 0_f64);
+array_from_optional_primitive!(u16, 0_u16);
+array_from_optional_primitive!(u32, 0_u32);
+array_from_optional_primitive!(u64, 0_u64);
+array_from_optional_primitive!(i8, 0_i8);
+array_from_optional_primitive!(i16, 0_i16);
+array_from_optional_primitive!(i32, 0_i32);
+array_from_optional_primitive!(i64, 0_i64);
+
+/// This method is mostly just used for unit tests
+impl From<Vec<&'static str>> for Array {
+    fn from(v: Vec<&'static str>) -> Self {
+        Array::from(v.iter().map(|s| s.to_string()).collect::<Vec<String>>())
+    }
+}
+
+impl From<Vec<String>> for Array {
+    fn from(v: Vec<String>) -> Self {
+        Array {
+            len: v.len() as i32,
+            null_count: 0,
+            validity_bitmap: None,
+            data: ArrayData::Utf8(List::from(v))
+        }
+    }
+}
+
+impl From<Vec<Rc<Array>>> for Array {
+    fn from(v: Vec<Rc<Array>>) -> Self {
+        Array {
+            len: v.len() as i32,
+            null_count: 0,
+            validity_bitmap: None,
+            data: ArrayData::Struct(v.iter().map(|a| a.clone()).collect())
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use super::super::datatypes::*;
+
+    #[test]
+    fn test_utf8_offsets() {
+        let a = Array::from(vec!["this", "is", "a", "test"]);
+        assert_eq!(4, a.len());
+        match a.data() {
+            &ArrayData::Utf8(List{ ref data, ref offsets }) => {
+                assert_eq!(11, data.len());
+                assert_eq!(0, *offsets.get(0));
+                assert_eq!(4, *offsets.get(1));
+                assert_eq!(6, *offsets.get(2));
+                assert_eq!(7, *offsets.get(3));
+                assert_eq!(11, *offsets.get(4));
+            },
+            _ => panic!()
+        }
+    }
+
+    #[test]
+    fn test_utf8_slices() {
+        let a = Array::from(vec!["this", "is", "a", "test"]);
+        match a.data() {
+            &ArrayData::Utf8(ref d) => {
+                assert_eq!(4, d.len());
+                assert_eq!("this", str::from_utf8(d.slice(0)).unwrap());
+                assert_eq!("is", str::from_utf8(d.slice(1)).unwrap());
+                assert_eq!("a", str::from_utf8(d.slice(2)).unwrap());
+                assert_eq!("test", str::from_utf8(d.slice(3)).unwrap());
+            },
+            _ => panic!()
+        }
+    }
+
+    #[test]
+    fn test_from_bool() {
+        let a = Array::from(vec![false, false, true, false]);
+        assert_eq!(4, a.len());
+    }
+
+    #[test]
+    fn test_from_f32() {
+        let a = Array::from(vec![1.23, 2.34, 3.45, 4.56]);
+        assert_eq!(4, a.len());
+    }
+
+    #[test]
+    fn test_from_i32() {
+
+        let a = Array::from(vec![15, 14, 13, 12, 11]);
+        assert_eq!(5, a.len());
+
+        assert_eq!(15, a.get(0).unwrap());
+        assert_eq!(14, a.get(1).unwrap());
+        assert_eq!(13, a.get(2).unwrap());
+        assert_eq!(12, a.get(3).unwrap());
+        assert_eq!(11, a.get(4).unwrap());
+    }
+
+    #[test]
+    fn test_from_optional_i32() {
+        let a = Array::from(vec![Some(1), None, Some(2), Some(3), None]);
+        assert_eq!(5, a.len());
+        // 1 == not null
+        let validity_bitmap = a.validity_bitmap.unwrap();
+        assert_eq!(true, validity_bitmap.is_set(0));
+        assert_eq!(false, validity_bitmap.is_set(1));
+        assert_eq!(true, validity_bitmap.is_set(2));
+        assert_eq!(true, validity_bitmap.is_set(3));
+        assert_eq!(false, validity_bitmap.is_set(4));
+    }
+
+    #[test]
+    fn test_struct() {
+
+        let _schema = Schema::new(vec![
+            Field::new("a", DataType::Int32, false),
+            Field::new("b", DataType::Float32, false),
+        ]);
+
+        let a = Rc::new(Array::from(vec![1,2,3,4,5]));
+        let b = Rc::new(Array::from(vec![1.1, 2.2, 3.3, 4.4, 5.5]));
+        let _ = Rc::new(Array::from(vec![a,b]));
+    }
+
+    #[test]
+    fn test_array_eq() {
+        let a = Array::from(vec![1,2,3,4,5]);
+        let b = Array::from(vec![5,4,3,2,1]);
+        let c = a.compare(&b, &|a: i32,b: i32| a == b).unwrap();
+        assert_eq!(c, vec![false,false,true,false,false]);
+    }
+
+    #[test]
+    fn test_array_lt() {
+        let a = Array::from(vec![1,2,3,4,5]);
+        let b = Array::from(vec![5,4,3,2,1]);
+        let c = a.compare(&b, &|a: i32,b: i32| a < b).unwrap();
+        assert_eq!(c, vec![true,true,false,false,false]);
+    }
+
+    #[test]
+    fn test_array_gt() {
+        let a = Array::from(vec![1,2,3,4,5]);
+        let b = Array::from(vec![5,4,3,2,1]);
+        let c = a.compare(&b, &|a: i32,b: i32| a > b).unwrap();
+        assert_eq!(c, vec![false,false,false,true,true]);
+    }
+
+    #[test]
+    fn test_array_add() {
+        let a = Array::from(vec![1,2,3,4,5]);
+        let b = Array::from(vec![5,4,3,2,1]);
+        let c = a.compute(&b, &|a: i32,b: i32| a + b).unwrap();
+        assert_eq!(c, vec![6,6,6,6,6]);
+    }
+
+    #[test]
+    fn test_array_multiply() {
+        let a = Array::from(vec![1,2,3,4,5]);
+        let b = Array::from(vec![5,4,3,2,1]);
+        let c = a.compute(&b, &|a: i32,b: i32| a * b).unwrap();
+        assert_eq!(c, vec![5,8,9,8,5]);
+    }
+}
+
+
+
+
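Note on `array.rs`: the doc comment on `ArrayOps::get` suggests adding iterator-style columnar operations in place of per-element access. The following is a minimal sketch of what that might look like, assuming plain slices stand in for the PR's `Buffer<T>`. The free functions `compare` and `compute`, the helper `check_len`, and the `String` error type are illustrative names and are not part of the PR. Unlike the loops in the diff, the sketch checks that both inputs have the same length before reading from them.

```rust
/// Hypothetical helper: both columns must have the same number of elements.
fn check_len(l: usize, r: usize) -> Result<(), String> {
    if l == r {
        Ok(())
    } else {
        Err(format!("length mismatch: {} vs {}", l, r))
    }
}

/// Element-wise comparison of two same-typed columns, e.g. eq, lt, gt.
pub fn compare<T: Copy>(
    l: &[T],
    r: &[T],
    f: impl Fn(T, T) -> bool,
) -> Result<Vec<bool>, String> {
    check_len(l.len(), r.len())?;
    // zip walks both slices in lockstep, with no raw pointer arithmetic
    Ok(l.iter().zip(r).map(|(&a, &b)| f(a, b)).collect())
}

/// Element-wise computation producing a column of the same type, e.g. c = a + b.
pub fn compute<T: Copy>(
    l: &[T],
    r: &[T],
    f: impl Fn(T, T) -> T,
) -> Result<Vec<T>, String> {
    check_len(l.len(), r.len())?;
    Ok(l.iter().zip(r).map(|(&a, &b)| f(a, b)).collect())
}
```

Because the type match would happen once per column rather than once per element, this shape avoids the per-call cost that the `get` doc comment warns about.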
diff --git a/rust/src/bitmap.rs b/rust/src/bitmap.rs
new file mode 100644
index 000000000..94c513a98
--- /dev/null
+++ b/rust/src/bitmap.rs
@@ -0,0 +1,85 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use super::buffer::Buffer;
+
+pub struct Bitmap {
+    bits: Buffer<u8>
+}
+
+impl Bitmap {
+
+    pub fn new(num_bits: usize) -> Self {
+        let num_bytes = num_bits/8 + if num_bits%8 > 0 { 1 } else { 0 };
+        let r = num_bytes % 64;
+        let len = if r==0 { num_bytes } else { num_bytes + 64-r };
+        let mut v = Vec::with_capacity(len);
+        for _ in 0 .. len {
+            v.push(255); // 1 is not null
+        }
+        Bitmap { bits: Buffer::from(v) }
+    }
+
+    pub fn len(&self) -> i32 {
+        self.bits.len()
+    }
+
+    pub fn is_set(&self, i: usize) -> bool {
+        let byte_offset = i / 8;
+        self.bits.get(byte_offset) & (1_u8 << ((i % 8) as u8)) > 0
+    }
+
+    pub fn set(&mut self, i: usize) {
+        let byte_offset = i / 8;
+        let v : u8 = {
+            self.bits.get(byte_offset) | (1_u8 << ((i % 8) as u8))
+        };
+        self.bits.set(byte_offset, v);
+    }
+
+    pub fn clear(&mut self, i: usize) {
+        let byte_offset = i / 8;
+        let v : u8 = self.bits.get(byte_offset) ^ (1_u8 << ((i % 8) as u8));
+        self.bits.set(byte_offset, v);
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_bitmap_length() {
+        assert_eq!(64, Bitmap::new(63*8).len());
+        assert_eq!(64, Bitmap::new(64*8).len());
+        assert_eq!(128, Bitmap::new(65*8).len());
+    }
+
+    #[test]
+    fn test_set_clear_bit() {
+        let mut b = Bitmap::new(64*8);
+        assert_eq!(true, b.is_set(12));
+        b.clear(12);
+        assert_eq!(false, b.is_set(12));
+        b.set(12);
+        assert_eq!(true, b.is_set(12));
+    }
+
+}
+
+
+
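Note on `bitmap.rs`: `Bitmap::clear` uses XOR, which flips the bit, so calling it twice on the same index would mark the value as valid again. The following is a minimal sketch of an idempotent variant, assuming a plain `Vec<u8>` in place of the PR's `Buffer<u8>`. The type name `SimpleBitmap` and the method `byte_len` are hypothetical and are not part of the PR.

```rust
/// Hypothetical stand-in for the PR's Bitmap, backed by an owned Vec<u8>.
/// Bit i == 1 means "value i is not null".
pub struct SimpleBitmap {
    bits: Vec<u8>,
}

impl SimpleBitmap {
    /// All bits start as 1 (valid); storage is padded to a multiple of 64 bytes.
    pub fn new(num_bits: usize) -> Self {
        let num_bytes = (num_bits + 7) / 8;
        let padded = (num_bytes + 63) / 64 * 64;
        SimpleBitmap { bits: vec![0xFF; padded] }
    }

    /// Number of bytes of storage, including padding.
    pub fn byte_len(&self) -> usize {
        self.bits.len()
    }

    pub fn is_set(&self, i: usize) -> bool {
        self.bits[i / 8] & (1u8 << (i % 8)) != 0
    }

    /// OR with the mask: sets the bit regardless of its previous state.
    pub fn set(&mut self, i: usize) {
        self.bits[i / 8] |= 1u8 << (i % 8);
    }

    /// AND with the inverted mask: clears the bit regardless of its
    /// previous state, so repeated calls are harmless.
    pub fn clear(&mut self, i: usize) {
        self.bits[i / 8] &= !(1u8 << (i % 8));
    }
}
```

Indexing the `Vec` also bounds-checks the byte offset, which the raw-pointer `Buffer::get` in the diff does not.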
diff --git a/rust/src/buffer.rs b/rust/src/buffer.rs
new file mode 100644
index 000000000..f70e0e2cd
--- /dev/null
+++ b/rust/src/buffer.rs
@@ -0,0 +1,101 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::mem;
+use std::slice;
+use libc;
+
+use super::memory::*;
+
+pub struct Buffer<T> {
+    data: *const T,
+    len: i32
+}
+
+impl<T> Buffer<T> {
+
+    pub fn new(data: *const T, len: i32) -> Self {
+        Buffer { data, len }
+    }
+
+    pub fn len(&self) -> i32 {
+        self.len
+    }
+
+    pub fn data(&self) -> *const T {
+        self.data
+    }
+
+    pub fn slice(&self, start: usize, end: usize) -> &[T] {
+        unsafe { slice::from_raw_parts(self.data.offset(start as isize), (end-start) as usize) }
+    }
+
+    pub fn get(&self, i: usize) -> &T {
+        unsafe { &(*self.data.offset(i as isize)) }
+    }
+
+    pub fn set(&mut self, i: usize, v: T) {
+        unsafe {
+            let p = mem::transmute::<*const T, *mut T>(self.data);
+            *p.offset(i as isize) = v;
+        }
+    }
+}
+
+macro_rules! array_from_primitive {
+    ($DT:ty) => {
+        impl From<Vec<$DT>> for Buffer<$DT> {
+            fn from(v: Vec<$DT>) -> Self {
+                // allocate aligned memory buffer
+                let len = v.len();
+                let sz = mem::size_of::<$DT>();
+                let buffer = allocate_aligned((len * sz) as i64).unwrap();
+                Buffer {
+                    len: len as i32,
+                    data: unsafe {
+                        let dst = mem::transmute::<*const u8, *mut libc::c_void>(buffer);
+                        libc::memcpy(dst, mem::transmute::<*const $DT, *const libc::c_void>(v.as_ptr()), len * sz);
+                        mem::transmute::<*mut libc::c_void, *const $DT>(dst)
+                    }
+                }
+            }
+        }
+    }
+}
+
+array_from_primitive!(bool);
+array_from_primitive!(f32);
+array_from_primitive!(f64);
+array_from_primitive!(u8);
+array_from_primitive!(u16);
+array_from_primitive!(u32);
+array_from_primitive!(u64);
+array_from_primitive!(i8);
+array_from_primitive!(i16);
+array_from_primitive!(i32);
+array_from_primitive!(i64);
+
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    #[test]
+    fn test_buffer_i32() {
+        let b: Buffer<i32> = Buffer::from(vec![1, 2, 3, 4, 5]);
+        assert_eq!(5, b.len);
+    }
+}
diff --git a/rust/src/datatypes.rs b/rust/src/datatypes.rs
new file mode 100644
index 000000000..cf104f372
--- /dev/null
+++ b/rust/src/datatypes.rs
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#[derive(Debug,Clone)]
+pub enum DataType {
+    Boolean,
+    Int8,
+    Int16,
+    Int32,
+    Int64,
+    UInt8,
+    UInt16,
+    UInt32,
+    UInt64,
+    Float32,
+    Float64,
+    Utf8,
+    Struct(Vec<Field>)
+}
+
+#[derive(Debug,Clone)]
+pub struct Field {
+    pub name: String,
+    pub data_type: DataType,
+    pub nullable: bool
+}
+
+impl Field {
+
+    pub fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
+        Field {
+            name: name.to_string(),
+            data_type: data_type,
+            nullable: nullable
+        }
+    }
+
+    pub fn to_string(&self) -> String {
+        format!("{}: {:?}", self.name, self.data_type)
+    }
+}
+
+#[derive(Debug,Clone)]
+pub struct Schema {
+    pub columns: Vec<Field>
+}
+
+impl Schema {
+
+    /// create an empty schema
+    pub fn empty() -> Self { Schema { columns: vec![] } }
+
+    pub fn new(columns: Vec<Field>) -> Self { Schema { columns: columns } }
+
+    /// look up a column by name and return a reference to the column along with its index
+    pub fn column(&self, name: &str) -> Option<(usize, &Field)> {
+        self.columns.iter()
+            .enumerate()
+            .find(|&(_,c)| c.name == name)
+    }
+
+    pub fn to_string(&self) -> String {
+        let s : Vec<String> = self.columns.iter()
+            .map(|c| c.to_string())
+            .collect();
+        s.join(",")
+    }
+
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_define_schema() {
+        let _person = Schema::new(vec![
+            Field::new("first_name", DataType::Utf8, false),
+            Field::new("last_name", DataType::Utf8, false),
+            Field::new("address", DataType::Struct(vec![
+                Field::new("street", DataType::Utf8, false),
+                Field::new("zip", DataType::UInt16, false),
+            ]), false),
+        ]);
+    }
+}
diff --git a/rust/src/error.rs b/rust/src/error.rs
new file mode 100644
index 000000000..837978049
--- /dev/null
+++ b/rust/src/error.rs
@@ -0,0 +1,29 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::convert::*;
+
+#[derive(Debug,Clone)]
+pub struct Error {
+    msg: String
+}
+
+impl From<&'static str> for Error where {
+    fn from(msg: &'static str) -> Self {
+        Error { msg: String::from(msg) }
+    }
+}
\ No newline at end of file
diff --git a/rust/src/lib.rs b/rust/src/lib.rs
new file mode 100644
index 000000000..b2caa6190
--- /dev/null
+++ b/rust/src/lib.rs
@@ -0,0 +1,28 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+extern crate bytes;
+extern crate libc;
+
+pub mod array;
+pub mod bitmap;
+pub mod buffer;
+pub mod datatypes;
+pub mod list;
+pub mod error;
+pub mod memory;
+
diff --git a/rust/src/list.rs b/rust/src/list.rs
new file mode 100644
index 000000000..a3a4e76fd
--- /dev/null
+++ b/rust/src/list.rs
@@ -0,0 +1,81 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::str;
+
+use bytes::{BytesMut, BufMut};
+
+use super::buffer::Buffer;
+
+pub struct List<T> {
+    pub data: Buffer<T>,
+    pub offsets: Buffer<i32>
+}
+
+impl<T> List<T> {
+
+    pub fn len(&self) -> i32 {
+        self.offsets.len()-1
+    }
+
+    pub fn slice(&self, index: usize) -> &[T] {
+        let start = *self.offsets.get(index) as usize;
+        let end = *self.offsets.get(index+1) as usize;
+        &self.data.slice(start, end)
+    }
+}
+
+impl From<Vec<String>> for List<u8> {
+    fn from(v: Vec<String>) -> Self {
+        let mut offsets : Vec<i32> = Vec::with_capacity(v.len() + 1);
+        let mut buf = BytesMut::with_capacity(v.len() * 32);
+        offsets.push(0_i32);
+        v.iter().for_each(|s| {
+            buf.put(s.as_bytes());
+            offsets.push(buf.len() as i32);
+        });
+        let bytes = buf.freeze();
+        let buffer = Buffer::new(bytes.as_ptr(), bytes.len() as i32);
+
+        List { data: buffer, offsets: Buffer::from(offsets) }
+    }
+}
+
+/// This method is mostly just used for unit tests
+impl From<Vec<&'static str>> for List<u8> {
+    fn from(v: Vec<&'static str>) -> Self {
+        List::from(v.iter().map(|s| s.to_string()).collect::<Vec<String>>())
+    }
+}
+
+
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_utf8_slices() {
+        let list = List::from(vec!["this", "is", "a", "test"]);
+        assert_eq!(4, list.len());
+        assert_eq!("this", str::from_utf8(list.slice(0)).unwrap());
+        assert_eq!("is", str::from_utf8(list.slice(1)).unwrap());
+        assert_eq!("a", str::from_utf8(list.slice(2)).unwrap());
+        assert_eq!("test", str::from_utf8(list.slice(3)).unwrap());
+    }
+
+}
\ No newline at end of file
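Note on `list.rs`: the `Utf8` layout stores all string bytes back to back in one data buffer, plus an offsets buffer with one more entry than there are strings, so string `i` occupies bytes `offsets[i]..offsets[i+1]`. The following is a minimal sketch of that layout, assuming owned `Vec`s in place of the PR's `Buffer<T>`. The type `Utf8Column` and its methods are hypothetical names and are not part of the PR. Owning the bytes keeps every slice valid for the lifetime of the column; in the diff, the `Bytes` value appears to go out of scope at the end of `from` while the `Buffer` keeps a raw pointer into it.

```rust
/// Hypothetical stand-in for List<u8>: owned Vecs instead of Buffer<T>.
pub struct Utf8Column {
    data: Vec<u8>,
    offsets: Vec<i32>,
}

impl Utf8Column {
    /// Concatenate the string bytes and record where each string ends.
    pub fn from_strs(values: &[&str]) -> Self {
        let mut data = Vec::new();
        let mut offsets = Vec::with_capacity(values.len() + 1);
        offsets.push(0);
        for s in values {
            data.extend_from_slice(s.as_bytes());
            offsets.push(data.len() as i32);
        }
        Utf8Column { data, offsets }
    }

    /// Number of strings, which is one less than the number of offsets.
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    pub fn offsets(&self) -> &[i32] {
        &self.offsets
    }

    /// Borrow string i; panics if i is out of range.
    pub fn get(&self, i: usize) -> &str {
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        // Offsets fall on the boundaries of the original &str values,
        // so each slice is valid UTF-8.
        std::str::from_utf8(&self.data[start..end]).expect("built from valid &str values")
    }
}
```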
diff --git a/rust/src/memory.rs b/rust/src/memory.rs
new file mode 100644
index 000000000..5cecaa13d
--- /dev/null
+++ b/rust/src/memory.rs
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::mem;
+use libc;
+
+use super::error::Error;
+
+const ALIGNMENT: usize = 64;
+
+pub fn allocate_aligned(size: i64) -> Result<*const u8, Error> {
+    unsafe {
+        let mut page: *mut libc::c_void = mem::uninitialized();
+        let result = libc::posix_memalign(&mut page, ALIGNMENT, size as usize);
+        match result {
+            0 => Ok(mem::transmute::<*mut libc::c_void, *const u8>(page)),
+            _ => Err(Error::from("Failed to allocate memory")),
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_allocate() {
+        let _ = allocate_aligned(32 * 1024).unwrap();
+    }
+
+}
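
Note on `memory.rs`: `allocate_aligned` returns a raw pointer from `posix_memalign` that nothing in the diff frees, and `mem::uninitialized` has since been deprecated in Rust. The following is a minimal sketch of a 64-byte-aligned allocation that releases its memory on drop, assuming the standard library's `std::alloc` API is used in place of `libc`. The type `AlignedBytes` and its methods are hypothetical and are not part of the PR.

```rust
use std::alloc::{alloc, dealloc, Layout};

const ALIGNMENT: usize = 64;

/// Hypothetical owner of a 64-byte-aligned allocation.
pub struct AlignedBytes {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBytes {
    /// Returns None for a zero size, an invalid layout, or allocation failure.
    pub fn new(size: usize) -> Option<Self> {
        if size == 0 {
            // `alloc` must not be called with a zero-sized layout
            return None;
        }
        let layout = Layout::from_size_align(size, ALIGNMENT).ok()?;
        // Safety: layout has a non-zero size, checked above.
        let ptr = unsafe { alloc(layout) };
        if ptr.is_null() {
            None
        } else {
            Some(AlignedBytes { ptr, layout })
        }
    }

    pub fn as_ptr(&self) -> *const u8 {
        self.ptr
    }

    pub fn len(&self) -> usize {
        self.layout.size()
    }
}

impl Drop for AlignedBytes {
    fn drop(&mut self) {
        // Safety: ptr was returned by `alloc` with this exact layout.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}
```

The memory is left uninitialized, as with `posix_memalign`, so a caller would need to write to it before reading.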


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> [Rust] Start native Rust Implementation
> ---------------------------------------
>
>                 Key: ARROW-2361
>                 URL: https://issues.apache.org/jira/browse/ARROW-2361
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Rust
>            Reporter: Andy Grove
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> I'm creating this Jira to track work to donate a work-in-progress native 
> Rust implementation of Arrow.
> I am actively developing this and relying on it for the memory model of my 
> DataFusion project. I would like to donate the code I have now and start 
> working on it under the Apache Arrow project.
> Here is the PR: https://github.com/apache/arrow/pull/1804
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
