Hi all,
I'm Patrick Stuedi from IBM Research Zurich. Most of my work over the
past few years has been on re-thinking the interface between distributed
data processing and I/O (both network and storage), with a focus on
performance. In this context I've been working at the core of the
Crail d
I found the XML file that is used to render the Crail project status
page, but in order to commit the updates via SVN (which I tried) it
seems I need access permissions (or karma, if I read the Apache docs
correctly). Where can we request permissions to access the SVN?
-Patrick
On Wed, Dec 13, 2017 at 11:18
Indeed, it looks like the repo is up and ready. We need to prepare a
few things for the first import. How are the permissions handled?
-Patrick
On Mon, Dec 18, 2017 at 9:53 AM, Jonas Pfefferle wrote:
> Makes sense to me.
>
> Jonas
>
> On Fri, 15 Dec 2017 13:44:19 -0800
>
> Julian Hyde wrote:
>
Hi Dawn,
Not a stupid question at all; in fact, a very relevant one. Storage
disaggregation is one of THE key use cases we built Crail for.
The goal is to enable existing data processing frameworks to
perform efficiently on remote storage (remote as seen from the compute
nodes). This is
Hi all,
If the GitHub repo is synced with the git repo in only one direction,
then what is the recommended way to handle new code contributions
(including code reviews)? We see two options here:
1) Code contributions are issued as PRs on the Crail Apache github
(and reviewed there), then merged outsi
Hi Bairen,
Your comment is spot on. The development of a C++ API for Crail is one
of the top items on the roadmap, in particular to facilitate the
integration into TensorFlow and serverless. In fact, I started drafting
a prototype two weeks ago that I wanted to share soon. If you are interested
could encourage a
> couple of students contributing code to Crail at this very stage if we see
> fit. It could also bring novel system/networking research opportunities to
> our lab.
>
> Let me know how we could better work together.
>
> Best,
> Bairen
>
> > On 17 Feb
Another interesting opportunity would be the development of a GPU
storage tier based on GPUDirect.
On Feb 17, 2018 4:00 PM, "Patrick Stuedi" wrote:
> That's great, one of the main goals of crail being an apache incubator
> project is to get more people involved in the devel
Thanks for sharing the doc. This is very much in line with the work we
have been doing for the past 6-7 years (https://github.com/zrlio,
https://researcher.watson.ibm.com/researcher/view.php?person=zurich-stu).
Please let us know if there is a specific aspect you would like to get
involved in the context of Cra
+1
-Patrick
On Thu, Apr 12, 2018 at 2:15 PM, Animesh Trivedi
wrote:
> Nice job Jonas.
>
> My vote : +1
>
> Thanks,
> --
> Animesh
>
> On Thu, Apr 12, 2018 at 1:33 PM, Jonas Pfefferle wrote:
>
>> Hi all,
>>
>> I packaged the source and updated the history for our first source release.
>>
>> Than
+1 (non-binding)
Thanks Jonas for putting together this release candidate! It looks fine to me.
-Patrick
On Wed, Apr 25, 2018 at 4:26 PM, Adrian Schuepbach wrote:
> +1
>
> Thanks for fixing everything commented on rc1.
>
> It looks like everything is fine to release rc2.
>
> Cheers
> Adrian
>
>
Luciano, Julian,
Is the idea that we put the artifacts into a staging directory (for
maven central?) prior to starting the IPMC vote?
If not required, can we call the IPMC vote? The source release is
prepared and the PPMC vote has passed, so from our side everything is ready.
Thanks,
Patrick
O
Hi all,
I know this is short notice, but in case you are attending Spark Summit'18
in San Francisco please consider joining my session on serverless machine
learning using Spark[1]. This is exciting new work based on Crail.
-Patrick
[1]
https://databricks.com/session/serverless-machine-learning-
Hey Sumit,
Data locality in Crail simply means that there is a locality API
(getLocations(file, offset, length)) that can be called by applications
to find out where a particular range of data is physically located,
i.e. where the corresponding blocks are. Crail has such an API, like HDFS also has
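To make the idea concrete, here is a minimal sketch of what such a locality query does. The function name, block size, and data structures below are illustrative assumptions, not the actual Crail API: a byte range is mapped onto the blocks it touches, and the hosts storing each of those blocks are returned.

```python
# Hypothetical locality lookup: given per-block host lists, return the
# hosts for every block overlapped by the byte range [offset, offset+length).
BLOCK_SIZE = 1 << 20  # assume 1 MiB blocks for this sketch

def get_locations(block_hosts, offset, length):
    """block_hosts: list where index i holds the hosts of block i."""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return {i: block_hosts[i] for i in range(first, last + 1)}

# A 3-block file whose blocks live on different (made-up) datanodes;
# a range straddling the first block boundary touches blocks 0 and 1.
hosts = [["node1"], ["node2"], ["node1", "node3"]]
print(get_locations(hosts, offset=BLOCK_SIZE - 10, length=20))
```

An application (e.g. a scheduler) would use this answer to place a task on a node that already holds the blocks it reads.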
Julian, thanks for the reminder! We definitely want to create a twitter
account for Crail, hopefully we can get this done in the next couple of
days...
-Patrick
On Mon, Jun 25, 2018 at 8:01 PM, Julian Hyde wrote:
> How does the Crail community feel about creating a Twitter account for the
> proj
Checksum and signatures checked. Builds from source.
+1
Thanks Jonas and Animesh for preparing the release.
-Patrick
On Tue, Oct 23, 2018 at 11:06 AM Adrian Schuepbach
wrote:
> Hi all
>
> My vote is +1.
>
> I looked at the source and binary tarballs. Looks fine to me.
>
> Thanks
> Adrian
>
+1.
Thanks everyone for putting the release together!
-Patrick
On Sat, Oct 27, 2018 at 12:10 PM Animesh Trivedi
wrote:
> Thanks Jonas for preparing it
> - checked checksum (both binary and source tar.gz)
> - checked signature (both binary and source tar.gz)
> - compared the release hash checko
Hi Mani,
Thanks for reaching out, we're always looking for new contributors. One
thing we are planning is to create a few starter JIRAs, hopefully within
the next week or so. You can then just pick one, discuss it, or create a PR.
Alternatively if you have a particular idea yourself feel free to creat
+1
- Compiling from source works
- Running from binary works
-Patrick
On Tue, Nov 13, 2018 at 11:43 AM Jonas Pfefferle wrote:
> Hi all,
>
>
> We prepared a new release to address the issues found in rc4. This is a
> call
> for a vote on releasing Apache Crail 1.1-incubating, release candidate
+1
+ Compiles from source
+ Runs from binaries
Thanks!
-Patrick
On Thu, Nov 15, 2018 at 5:16 PM Jonas Pfefferle wrote:
> Hi all
>
>
> For another round, we prepared a new release to address the issues found
> in
> rc5. This is a call to vote on releasing Apache Crail 1.1-incubating,
> release
Hi,
[https://issues.apache.org/jira/browse/CRAIL-86]
GPU integration with Crail is an important item on our TODO list and
has been on the roadmap for a while. There are two (maybe more) ways
GPUs could be integrated in Crail:
a) as a GPU tier: here, the memories of individual GPUs in the clust
+1 from me too; we should create the infra JIRA as Luciano suggested.
Jonas has interacted with the infra team in the past.
On Sun, Jan 6, 2019 at 8:09 PM Felix Cheung wrote:
>
> Hi there - any more vote/comment?
>
>
>
> From: Luciano Resende
> Sent: Friday, Janu
atrick
>
> As discussed earlier can you please assign some jira to me . i am very much
> excited to contribute to this platform.
>
>
> Thanks and Regards
>
> On Mon, Nov 12, 2018 at 2:48 AM Patrick Stuedi wrote:
>>
>> Hi Mani,
>>
>> Thanks for reachi
Done. Just posted a tweet.
On Fri, Jan 25, 2019 at 7:59 PM Julian Hyde wrote:
> May I suggest you tweet a link to the blog post? Wes and I are active on
> twitter and can help bring Crail to a wider audience.
>
> > On Jan 25, 2019, at 4:48 AM, Jonas Pfefferle wrote:
> >
> > Hi @all
> >
> >
> >
Hi all,
A new blog post is available on disaggregating shuffle data in distributed
data processing workloads:
http://crail.incubator.apache.org/blog/2019/03/disaggregation.html
-Patrick
Hi Nieng,
Yes, we are working on it. The plan is to build a Crail-based TF "Dataset"
that fits into tf.data. The repo
https://github.com/patrickstuedi/tensorflow-crail contains a rough skeleton
of that code. crail-tensorflow is based on Crail Native, a new C++ Crail
client side implementation (htt
Hi William,
You have to differentiate the server-side registration from the
client-side registration. The links above are server side. There we
allocate memory in larger segments (defined by the config variable
crail.allocationSize). AllocationSize must be a multiple of crail.bufferSize, which is
etween different connections which can keep
> state on the NIC small)?
> Based on code study and discussion, I suppose it's close to 2.
>
> Again, many thanks.
>
> William
>
> On Mon, Mar 25, 2019 at 2:42 AM Patrick Stuedi wrote:
>
> > HI William,
> >
> >
The closing issue is related to using Crail for input/output. The changes
Adrian made just earlier today are changes on the shuffle plugin. Are you
using the Crail shuffle plugin at all? If not then the changes of Adrian
are not relevant to you.
-Patrick
On Wed, Jun 19, 2019 at 3:22 PM David Cres
rent configs.
>
>
>
> Regards,
>
>
>
>David
>
>
>
>
>
>
> From: Patrick Stuedi
> Sent: Wednesday, June 19, 2019 6:29:11 AM
> To: dev@crail.apache.org
> Cc: Jonas Pfefferle; d...@crail.incubator.apache.org
> Subject
,
>
>
>
>David
>
>
>
> ____
> From: Patrick Stuedi
> Sent: Wednesday, June 19, 2019 6:40:10 AM
> To: dev@crail.apache.org
> Cc: Jonas Pfefferle; d...@crail.incubator.apache.org
> Subject: Re: Crail used as type 2 storage for TeraSort does not catch the
> "
There is a bug currently in NaRPC which increases the likelihood of hangs
in Crail/TCP as data sizes increase. We have identified the actual
problem in NaRPC but didn't get to fixing it so far. I can look into this.
-Patrick
On Wed, Aug 21, 2019 at 12:35 AM 'Ben Sidhom' via zrlio-users <
zrli
It looks to me like you have localmap enabled (it's actually true by
default), an optimization where mmap is used for access to local blocks
(served by a local datanode). Somehow it seems the local endpoint
can't find the directory where the data is (which shouldn't happen). In any case
you can
Hi Shashank,
Short answers to your questions:
1) The NVMf tier is a data tier, it stores data just like the RDMA tier or
the TCP tier. This is a bug in the documentation as far as I can see.
2) You need to start an NVMf datanode. The datanode will register the NVMf
resources and target with the C
that
more clear.
-Patrick
On Wed, Sep 25, 2019 at 7:46 PM Patrick Stuedi wrote:
> Hi Shashank,
>
> Short answers to your questions:
>
> 1) The NVMf tier is a data tier, it stores data just like the RDMA tier or
> the TCP tier. This is a bug in the documentation as far as I can
+1
* Release builds from source without errors
* Basic setup and tools work fine
Minor comment: we could give a better error message for the case where
JAVA_HOME is not set.
Cheers,
Patrick
On Tue, Dec 3, 2019 at 3:13 PM Adrian Schuepbach <
adrian.schuepb...@gribex.net> wrote:
> Hi all,
>
>
Adrian, thanks for providing this info. Looking forward to the
implementation.
Best regards,
Patrick
On Wed, May 27, 2020 at 2:32 AM Adrian Schüpbach <
adrian.schuepb...@gribex.net> wrote:
> Dear all
>
> Crail supports dynamically adding new datanodes, while the Crail cluster
> is running.
>
>
Hi Laurent
Yes, crail://localhost:9060/mypath would be the path you want to use in your
Spark program to access Crail. Did you check whether your Crail deployment is up
and running and can be accessed using the crail hdfs client (./bin/crail
fs)? If the crail hdfs client works, access via Spark should be
I think offboarding the project from Apache Incubator is a reasonable step
to consider given the activity in the last year. I would be fine with such
a step if others are ok with it. I see Crail being used in benchmarks in
research papers every once in a while, but I'm not aware of a bigger use
case
Hi all,
I share the opinions of Jonas and Bernard, happy to move on to an official
voting process.
Best regards,
Patrick
On Tue, May 31, 2022 at 1:30 PM bernard metzler wrote:
> On 31/05/2022 09:54, Jonas Pfefferle wrote:
> > Julian,
> >
> > Thanks for starting the discussion. As previously
+1
-Patrick
On Thu, Jun 2, 2022 at 8:58 AM Jonas Pfefferle wrote:
> +1
>
> On Wed, 1 Jun 2022 12:28:07 -0700
> Julian Hyde wrote:
> > Crail has been in incubation since late 2017 [1]. For the last
> >couple
> > of years, activity has been low [2], and the chances of attracting
> >new
> > c