Re: Kicking off incubation

2017-12-06 Thread Patrick Stuedi
Hi all, I'm Patrick Stuedi from IBM Research Zurich. Most of my work over the last years has been on re-thinking the interface between distributed data processing and I/O (both network and storage) with focus on performance. In this context I've been working at the core of the Crail d

Re: Incubation status page

2017-12-14 Thread Patrick Stuedi
I found the xml file that is used to render the crail project status page, but in order to commit the updates via SVN (which I tried) it seems I need access permissions (or Karma if I read the Apache doc), where can we request permissions to access the svn? -Patrick On Wed, Dec 13, 2017 at 11:18

Re: Source code hosting

2017-12-18 Thread Patrick Stuedi
Indeed it looks like the repo is up and ready? We need to prepare a few things for the first import. How are the permissions handled? -Patrick On Mon, Dec 18, 2017 at 9:53 AM, Jonas Pfefferle wrote: > Makes sense to me. > > Jonas > > On Fri, 15 Dec 2017 13:44:19 -0800 > > Julian Hyde wrote: >

Re: A stupid question

2018-01-25 Thread Patrick Stuedi
Hi Dawn, Not a stupid question at all, a very relevant question. Storage disaggregation is one of THE key use cases we have built Crail for. The goal is to enable existing data processing platforms and frameworks to perform efficiently on remote storage (remote as seen from the compute nodes). This is

Re: Github pull requests

2018-02-14 Thread Patrick Stuedi
Hi all, If the github repo is synced with git repo only in one direction, then what is the recommended way to handle new code contributions (including code reviews)? We see two options here: 1) Code contributions are issued as PRs on the Crail Apache github (and reviewed there), then merged outsi

Re: C (or language agnostic) API for Crail

2018-02-17 Thread Patrick Stuedi
Hi Bairen, Your comment is spot on. The development of a C++ API for Crail is one of the top items on the roadmap, in particular to facilitate the integration into TensorFlow and serverless. In fact I started drafting a prototype two weeks ago that I wanted to share soon. If you are interested

Re: C (or language agnostic) API for Crail

2018-02-17 Thread Patrick Stuedi
could encourage a > couple of students contributing code to Crail at this very stage if we see > fit. It could also bring novel system/networking research opportunities to > our lab. > > Let me know how we could better work together. > > Best, > Bairen > > > On 17 Feb

Re: C (or language agnostic) API for Crail

2018-02-17 Thread Patrick Stuedi
Another interesting opportunity would be the development of a GPU storage tier based on GPUDirect. On Feb 17, 2018 4:00 PM, "Patrick Stuedi" wrote: > That's great, one of the main goals of crail being an apache incubator > project is to get more people involved in the devel

Re: C (or language agnostic) API for Crail

2018-02-20 Thread Patrick Stuedi
Thanks for sharing the doc. This is very much in line with the work we have been doing for 6-7 years (https://github.com/zrlio, https://researcher.watson.ibm.com/researcher/view.php?person=zurich-stu). Please let us know if there is a specific aspect you would like to get involved in the context of Cra

Re: [VOTE] Crail v1.0-rc0 source release

2018-04-12 Thread Patrick Stuedi
+1 -Patrick On Thu, Apr 12, 2018 at 2:15 PM, Animesh Trivedi wrote: > Nice job Jonas. > > My vote : +1 > > Thanks, > -- > Animesh > > On Thu, Apr 12, 2018 at 1:33 PM, Jonas Pfefferle wrote: > >> Hi all, >> >> I packaged the source and updated the history for our first source release. >> >> Than

Re: [VOTE] Crail v1.0-rc2 source release

2018-04-25 Thread Patrick Stuedi
+1 (non-binding) Thanks Jonas for putting together this release candidate! It looks fine to me. -Patrick On Wed, Apr 25, 2018 at 4:26 PM, Adrian Schuepbach wrote: > +1 > > Thanks for fixing everything commented on rc1. > > It looks like everything is fine to release rc2. > > Cheers > Adrian > >

Re: Release Next Steps, was Re: [VOTE] Crail v1.0-rc2 source release

2018-05-07 Thread Patrick Stuedi
Luciano, Julian, Is the idea that we put the artifacts into a staging directory (for maven central?) prior to starting the IPMC vote? If not required, can we make the call for the IPMC vote? Source release is prepared, PPMC vote has passed, so from our side everything is ready. Thanks, Patrick O

Spark Summit'18: Serverless machine learning using Apache Crail

2018-06-05 Thread Patrick Stuedi
Hi all, I know this is short notice, but in case you are attending Spark Summit'18 in San Francisco please consider joining my session on serverless machine learning using Spark[1]. This is exciting new work based on Crail. -Patrick [1] https://databricks.com/session/serverless-machine-learning-

Re: Crail and data locality question

2018-06-14 Thread Patrick Stuedi
Hey Sumit, Data locality in Crail simply means that there is a locality API (getLocations(file, offset, length)) that can be called by applications to find out where a particular range of data is physically located, i.e. where the corresponding blocks are. Crail has such an API, like HDFS also has
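
To illustrate what a locality lookup of this kind returns, here is a minimal sketch in Python. It is not Crail's implementation or its actual signature: the BlockLocation record, the get_locations helper, the host names, and the block sizes are all hypothetical and only model the idea of mapping a byte range to the blocks (and hosts) that hold it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockLocation:
    """Hypothetical record: one block of a file and the host storing it."""
    offset: int  # byte offset of the block within the file
    length: int  # block length in bytes
    host: str    # node that physically holds the block


def get_locations(blocks: List[BlockLocation], offset: int, length: int) -> List[BlockLocation]:
    """Return the blocks that overlap the byte range [offset, offset + length)."""
    if length <= 0:
        return []
    end = offset + length
    return [b for b in blocks if b.offset < end and b.offset + b.length > offset]


# Illustrative layout: three 1 KiB blocks on three made-up nodes.
blocks = [
    BlockLocation(0, 1024, "node-a"),
    BlockLocation(1024, 1024, "node-b"),
    BlockLocation(2048, 1024, "node-c"),
]

# A 1 KiB read starting at byte 512 spans the first two blocks.
hosts = [b.host for b in get_locations(blocks, 512, 1024)]
```

A scheduler could use such a host list to place a task on a node that already holds the data, which is the usual purpose of a locality API.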

Re: Twitter

2018-06-26 Thread Patrick Stuedi
Julian, thanks for the reminder! We definitely want to create a twitter account for Crail, hopefully we can get this done in the next couple of days... -Patrick On Mon, Jun 25, 2018 at 8:01 PM, Julian Hyde wrote: > How does the Crail community feel about creating a Twitter account for the > proj

Re: [VOTE] Release of Apache Crail-1.1-incubating [rc2]

2018-10-23 Thread Patrick Stuedi
checksum and signatures checked. builds from source. +1 Thanks Jonas and Animesh for preparing the release. -Patrick On Tue, Oct 23, 2018 at 11:06 AM Adrian Schuepbach wrote: > Hi all > > My vote is +1. > > I looked at the source and binary tarballs. Looks fine to me. > > Thanks > Adrian >

Re: [VOTE] Release of Apache Crail-1.1-incubating [rc3]

2018-10-27 Thread Patrick Stuedi
+1. Thanks everyone for putting the release together! -Patrick On Sat, Oct 27, 2018 at 12:10 PM Animesh Trivedi wrote: > Thanks Jonas for preparing it > - checked checksum (both binary and source tar.gz) > - checked signature (both binary and source tar.gz) > - compared the release hash checko

Re: Introduction

2018-11-11 Thread Patrick Stuedi
Hi Mani, Thanks for reaching out, we're always looking for new contributors. One thing we are planning is to create a few starter JIRAs, hopefully within the next week or so. You can then just pick, discuss on, or create a PR. Alternatively if you have a particular idea yourself feel free to creat

Re: [VOTE] Release of Apache Crail-1.1-incubating [rc5]

2018-11-15 Thread Patrick Stuedi
+1 - Compiling from source works - Running from binary works -Patrick On Tue, Nov 13, 2018 at 11:43 AM Jonas Pfefferle wrote: > Hi all, > > > We prepared a new release to address the issues found in rc4. This is a > call > for a vote on releasing Apache Crail 1.1-incubating, release candidate

Re: [VOTE] Release of Apache Crail-1.1-incubating [rc6]

2018-11-17 Thread Patrick Stuedi
+1 + Compiles from source + Runs from binaries Thanks! -Patrick On Thu, Nov 15, 2018 at 5:16 PM Jonas Pfefferle wrote: > Hi all > > > For another round, we prepared a new release to address the issues found > in > rc5. This is a call to vote on releasing Apache Crail 1.1-incubating, > release

Re: Crail on support GPU memory status?

2018-11-19 Thread Patrick Stuedi
Hi, [https://issues.apache.org/jira/browse/CRAIL-86] GPU integration with Crail is an important item on our TODO list and has been on the roadmap for a while. There are two (maybe more) ways GPUs could be integrated in Crail: a) as a GPU tier: here, the memories of individual GPUs in the clust

Re: [NOTICE] Mandatory migration of git repositories to gitbox.apache.org

2019-01-06 Thread Patrick Stuedi
+1 from me too, we should create the infra JIRA as Luciano suggested. Jonas has interacted with the infra team in the past. On Sun, Jan 6, 2019 at 8:09 PM Felix Cheung wrote: > > Hi there - any more vote/comment? > > > > From: Luciano Resende > Sent: Friday, Janu

Re: Introduction

2019-01-08 Thread Patrick Stuedi
atrick > > As discussed earlier can you please assign some jira to me . i am very much > excited to contribute to this platform. > > > Thanks and Regards > > On Mon, Nov 12, 2018 at 2:48 AM Patrick Stuedi wrote: >> >> Hi Mani, >> >> Thanks for reachi

Re: New blog post: Call overhead Python -> C/C++

2019-01-25 Thread Patrick Stuedi
Done. Just posted a tweet. On Fri, Jan 25, 2019 at 7:59 PM Julian Hyde wrote: > May I suggest you tweet a link to the blog post? Wes and I are active on > twitter and can help bring Crail to a wider audience. > > > On Jan 25, 2019, at 4:48 AM, Jonas Pfefferle wrote: > > > > Hi @all > > > > > >

New blog post: disaggregating shuffle data

2019-03-05 Thread Patrick Stuedi
Hi all, A new blog post is available on disaggregating shuffle data in distributed data processing workloads: http://crail.incubator.apache.org/blog/2019/03/disaggregation.html -Patrick

Re: Tensorflow and Crail Integration

2019-03-19 Thread Patrick Stuedi
Hi Nieng, Yes, we are working on it. The plan is to build a Crail-based TF "Dataset" that fits into tf.data. The repo https://github.com/patrickstuedi/tensorflow-crail contains a rough skeleton of that code. crail-tensorflow is based on Crail Native, a new C++ Crail client side implementation (htt

Re: RDMA and Crail Implementation

2019-03-24 Thread Patrick Stuedi
Hi William, You have to differentiate the server side registration from the client side registration. The links above are server side. There we allocate memory in larger segments (defined by the config variable crail.allocationSize). AllocationSize must be a multiple of crail.bufferSize which is
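
The multiple-of constraint mentioned here can be checked with a few lines of Python. This is only a sketch: the two parameter names mirror the config variables crail.allocationSize and crail.bufferSize from the message, but the buffers_per_segment helper and the example values (1 GiB and 1 MiB) are assumptions for illustration, not Crail code or Crail defaults.

```python
def buffers_per_segment(allocation_size: int, buffer_size: int) -> int:
    """Return how many buffers fit into one allocated segment.

    Raises ValueError if allocation_size is not a positive multiple
    of buffer_size.
    """
    if buffer_size <= 0 or allocation_size <= 0:
        raise ValueError("sizes must be positive")
    if allocation_size % buffer_size != 0:
        raise ValueError("allocation size must be a multiple of buffer size")
    return allocation_size // buffer_size


# Illustrative values only: a 1 GiB segment split into 1 MiB buffers.
count = buffers_per_segment(1 << 30, 1 << 20)
```

Validating the two values together before starting a datanode makes a misconfiguration fail early instead of surfacing later as an allocation problem.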

Re: RDMA and Crail Implementation

2019-03-25 Thread Patrick Stuedi
etween different connections which can keep > state on the NIC small)? > Based on code study and discussion, I suppose it's close to 2. > > Again, many thanks. > > William > > On Mon, Mar 25, 2019 at 2:42 AM Patrick Stuedi wrote: > > > HI William, > > > >

Re: Crail used as type 2 storage for TeraSort does not catch the "finished" signal

2019-06-19 Thread Patrick Stuedi
The closing issue is related to using Crail for input/output. The changes Adrian made just earlier today are changes on the shuffle plugin. Are you using the Crail shuffle plugin at all? If not then the changes of Adrian are not relevant to you. -Patrick On Wed, Jun 19, 2019 at 3:22 PM David Cres

Re: Crail used as type 2 storage for TeraSort does not catch the "finished" signal

2019-06-19 Thread Patrick Stuedi
rent configs. > > > > Regards, > > > >David > > > > > > > From: Patrick Stuedi > Sent: Wednesday, June 19, 2019 6:29:11 AM > To: dev@crail.apache.org > Cc: Jonas Pfefferle; d...@crail.incubator.apache.org > Subject

Re: Crail used as type 2 storage for TeraSort does not catch the "finished" signal

2019-06-19 Thread Patrick Stuedi
, > > > >David > > > > ____ > From: Patrick Stuedi > Sent: Wednesday, June 19, 2019 6:40:10 AM > To: dev@crail.apache.org > Cc: Jonas Pfefferle; d...@crail.incubator.apache.org > Subject: Re: Crail used as type 2 storage for TeraSort does not catch the > "

Re: [zrlio-users] Getting Crail to work over TCP

2019-08-20 Thread Patrick Stuedi
There is a bug currently in NaRPC which increases the likelihood of hangs in Crail/TCP as the data sizes increase. We have identified the actual problem in NaRPC but haven't gotten to fixing it so far. I can look into this. -Patrick On Wed, Aug 21, 2019 at 12:35 AM 'Ben Sidhom' via zrlio-users < zrli

Re: Local RDMA data path missing error in crail-client

2019-09-24 Thread Patrick Stuedi
It looks to me like you have localmap enabled (it's actually true by default) which is an optimization where for access to local blocks (served by a local datanode) mmap is used. Somehow it seems the local endpoint can't find the directory where the data is (which shouldn't happen). In any case you can

Re: Deploying Crail with NVMf storage backend

2019-09-25 Thread Patrick Stuedi
Hi Shashank, Short answers to your questions: 1) The NVMf tier is a data tier, it stores data just like the RDMA tier or the TCP tier. This is a bug in the documentation as far as I can see. 2) you need to start a NVMf datanode. The datanode will register the NVMf resources and target with the C

Re: Deploying Crail with NVMf storage backend

2019-09-25 Thread Patrick Stuedi
that more clear. -Patrick On Wed, Sep 25, 2019 at 7:46 PM Patrick Stuedi wrote: > Hi Shashank, > > Short answers to your questions: > > 1) The NVMf tier is a data tier, it stores data just like the RDMA tier or > the TCP tier. This is a bug in the documentation as far as I can

Re: [VOTE] Release of Apache Crail-1.2-incubating rc2

2019-12-04 Thread Patrick Stuedi
+1 * Release builds from source without errors * Basic setup and tools work fine Minor comment, we could give a better error message for the case where JAVA_HOME is not set. Cheers, Patrick On Tue, Dec 3, 2019 at 3:13 PM Adrian Schuepbach < adrian.schuepb...@gribex.net> wrote: > Hi all, > >

Re: Mechanism that allow datanode to leave

2020-05-27 Thread Patrick Stuedi
Adrian, thanks for providing this info. Looking forward to the implementation. Best regards, Patrick On Wed, May 27, 2020 at 2:32 AM Adrian Schüpbach < adrian.schuepb...@gribex.net> wrote: > Dear all > > Crail supports dynamically adding new datanodes, while the Crail cluster > is running. > >

Re: HDFS adapter

2021-04-19 Thread Patrick Stuedi
Hi Laurent, Yes, crail://localhost:9060/mypath would be the path you want to use in your Spark program to access Crail. Did you check if your Crail deployment is up and running and can be accessed using the crail hdfs client (./bin/crail fs)? If the crail hdfs client works, access via Spark should be
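
When debugging a path like the one above, it can help to split the URI into its parts and compare host and port against the address the Crail deployment was configured with. The sketch below uses only Python's standard library; the split_crail_uri helper is hypothetical and not part of Crail or Spark.

```python
from urllib.parse import urlsplit


def split_crail_uri(uri: str):
    """Split a crail:// URI into (host, port, path)."""
    parts = urlsplit(uri)
    if parts.scheme != "crail":
        raise ValueError("expected a crail:// URI, got scheme %r" % parts.scheme)
    return parts.hostname, parts.port, parts.path


# The URI from the message above.
host, port, path = split_crail_uri("crail://localhost:9060/mypath")
```

If host and port do not match the running deployment, the Spark job will not be able to reach Crail even when the path itself is correct.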

Re: Board reports

2022-04-24 Thread Patrick Stuedi
I think offboarding the project from Apache Incubator is a reasonable step to consider given the activity in the last year. I would be fine with such a step if others are ok with it. I see Crail being used in benchmarks in research papers every once in a while but I'm not aware of a bigger use case

Re: [DISCUSS] Retire Crail from the Incubator

2022-06-01 Thread Patrick Stuedi
Hi all, I share the opinions of Jonas and Bernard, happy to move on to an official voting process. Best regards, Patrick On Tue, May 31, 2022 at 1:30 PM bernard metzler wrote: > On 31/05/2022 09:54, Jonas Pfefferle wrote: > > Julian, > > > > Thanks for starting the discussion. As previously

Re: [VOTE] Retire Crail from Incubator

2022-06-02 Thread Patrick Stuedi
+1 -Patrick On Thu, Jun 2, 2022 at 8:58 AM Jonas Pfefferle wrote: > +1 > > On Wed, 1 Jun 2022 12:28:07 -0700 > Julian Hyde wrote: > > Crail has been in incubation since late 2017 [1]. For the last > >couple > > of years, activity has been low [2], and the chances of attracting > >new > > c