Re: How to write a Blob using an OutputStream?

2014-08-06 Thread Adrian Cole
Looks fine, I might switch to composition, but that's just a style nit. Ex.

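class PutOnCloseOutputStream extends FilterOutputStream {
  private final okio.Buffer buffer = new okio.Buffer();
  ...

  PutOnCloseOutputStream(…) {
    // as written this will not compile: a field cannot be referenced
    // before super(...) is called, so create the Buffer first and pass it in
    super(buffer.outputStream());
    …
  }

  @Override public void close() throws IOException {
    // put buffer.inputStream() with length buffer.size()
  }
}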
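
A minimal sketch of the composition variant mentioned above, using only java.io in place of okio so that it is self-contained. BlobUploader is a hypothetical hook invented for this example, not a jclouds type; in real code it would presumably wrap a blobStore.putBlob(...) call.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical upload hook (an assumption, not part of jclouds). */
interface BlobUploader {
    void upload(byte[] payload) throws IOException;
}

/** Buffers writes in memory and uploads the payload once, on the first close(). */
final class PutOnCloseOutputStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final BlobUploader uploader;
    private boolean closed;

    PutOnCloseOutputStream(BlobUploader uploader) {
        this.uploader = uploader;
    }

    @Override public void write(int b) throws IOException {
        ensureOpen();
        buffer.write(b);
    }

    @Override public void write(byte[] b, int off, int len) throws IOException {
        ensureOpen();
        buffer.write(b, off, len);
    }

    @Override public void close() throws IOException {
        if (closed) {
            return; // a second close() must not upload again
        }
        closed = true;
        // The full payload is in memory here, so its length is known up front.
        uploader.upload(buffer.toByteArray());
    }

    private void ensureOpen() throws IOException {
        if (closed) {
            throw new IOException("stream already closed");
        }
    }
}
```

Holding the buffer as a field rather than extending ByteArrayOutputStream keeps toByteArray(), reset() and friends out of the caller's reach, which is the main argument for composition here.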

RE: How to write a Blob using an OutputStream?

2014-08-05 Thread Zack Shoylev
Your code seems fine. I have used 
http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/output/ByteArrayOutputStream.html
 in the past to convert between stream types, but it seems like it doesn't 
match your case very well.

Note you might have to do writeBytesToBlob() before super.close(), but you can 
test that.

Let us know how it turns out!

From: Steve Kingsland [steve.kingsl...@opower.com]
Sent: Monday, August 04, 2014 9:22 PM
To: user@jclouds.apache.org
Subject: Re: How to write a Blob using an OutputStream?

OK, then it appears that my calling code (which would be difficult and risky to 
change) is incompatible with jclouds' BlobStore API: my caller wants to obtain 
an OutputStream for writing to the blob store, and jclouds wants to obtain an 
InputStream for reading the blob's content that should be written. Therefore, 
my only solution is to buffer the blob data, either in memory or on disk, 
before uploading it to the blob store.

Given that the documents I'm trying to write to the blob store will generally 
be small (1KB to 1MB), I'm going with a simple approach, for providing my 
caller with an OutputStream that they can use to write the blob's payload:

import java.io.IOException;
import org.jclouds.blobstore.BlobStore;
import org.jclouds.blobstore.domain.Blob;

class BlobWritingByteArrayOutputStream extends java.io.ByteArrayOutputStream {

    // these are all set in the constructor
    private BlobStore blobStore;
    private String containerName, blobName;

    // the client will have to call this when he's finished writing, so this is
    // our chance to upload the blob, now that we have the full payload in memory
    @Override
    public void close() throws IOException {
        super.close();

        writeBytesToBlob();
    }

    private void writeBytesToBlob() {
        byte[] payload = toByteArray();

        Blob blob = blobStore.blobBuilder(blobName)
                .payload(payload)
                .contentLength(payload.length)
                .build();
        blobStore.putBlob(containerName, blob);
    }
}

Aside from the weird inversion of control going on and the requirement that 
close() be called, I think something simple like this - to buffer the bytes 
being written before uploading them to the blob store - might work for me.

Thoughts?





Steve Kingsland

Senior Software Engineer

Opower http://www.opower.com/

We’re hiring! See jobs here: http://www.opower.com/careers


On Mon, Aug 4, 2014 at 9:05 PM, Andrew Gaul g...@apache.org wrote:
On Mon, Aug 04, 2014 at 08:46:37PM -0400, Steve Kingsland wrote:
 Here is Kevin's example using PipedInputStream and PipedOutputStream:
 https://groups.google.com/d/msg/jclouds/F2pCt9i7TSg/AUF4AqOO0TMJ

 I don't have the need to use different threads, though, so instead I'd do
 something like this?

This will not work; putBlob blocks until the operation completes.
Further you must use PipedInputStream/PipedOutputStream with separate
threads to avoid deadlock, as its Javadoc states:

http://docs.oracle.com/javase/7/docs/api/java/io/PipedInputStream.html
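
As a rough illustration of the two-thread requirement, here is a minimal sketch using only java.io and java.util.concurrent. The drain method is a stand-in for a blocking consumer such as putBlob and is not jclouds code; PipedUploadDemo and roundTrip are names made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PipedUploadDemo {
    /** Stand-in for a blocking consumer (assumption): reads the stream to its end. */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            sink.write(chunk, 0, n);
        }
        return sink.toByteArray();
    }

    /** Writes the payload on the calling thread while a second thread reads it. */
    static byte[] roundTrip(byte[] payload) throws Exception {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        ExecutorService reader = Executors.newSingleThreadExecutor();
        try {
            // The reader must already be running: the pipe's buffer is small,
            // so a single thread would block in write() with nobody to drain it.
            Future<byte[]> received = reader.submit(() -> drain(in));
            out.write(payload);
            out.close(); // signals end-of-stream to the reader
            return received.get();
        } finally {
            reader.shutdown();
        }
    }
}
```

Doing the write and the read on the same thread with a payload larger than the pipe's internal buffer is the deadlock the Javadoc warns about.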

Unfortunately jclouds has poor support for asynchronous operations and
you can really only fake the desired behavior with various InputStream
implementations. I strongly recommend trying to cast your solution into
some kind of ByteSource or InputStream.

 And then when close() or flush() is called on the returned OutputStream,
 the blob is uploaded like magic? Is it OK that I'm not setting the content
 length?

Some blobstores, specifically Amazon S3, require a content length, while
others such as OpenStack Swift do not.

--
Andrew Gaul
http://gaul.org/



RE: How to write a Blob using an OutputStream?

2014-08-05 Thread Zack Shoylev
With buffered streams, for example, close() causes buffers to be flushed (which 
is technically what you are doing).
So yes, you can get some serious exceptions when closing.
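
To make that concrete, a small sketch with java.io only: the write fits in the buffer and succeeds, and the underlying failure surfaces only when close() flushes. FailingSink and whereDoesItFail are illustrative names, and the failing sink merely stands in for an unreachable blob store.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class CloseFailureDemo {
    /** A sink that rejects every write, standing in for a failing remote store. */
    static final class FailingSink extends OutputStream {
        @Override public void write(int b) throws IOException {
            throw new IOException("store unavailable");
        }
    }

    /** Returns where the failure surfaced: "write", "close", or "none". */
    static String whereDoesItFail() {
        OutputStream out = new BufferedOutputStream(new FailingSink(), 1024);
        try {
            // Three bytes fit in the 1024-byte buffer, so nothing reaches the sink yet.
            out.write(new byte[] {1, 2, 3});
        } catch (IOException e) {
            return "write";
        }
        try {
            // close() flushes the buffer, and only now does the sink's error appear.
            out.close();
        } catch (IOException e) {
            return "close";
        }
        return "none";
    }
}
```

Here whereDoesItFail() returns "close", which is the same shape of behavior as a stream that uploads its buffer when closed.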


From: Steve Kingsland [steve.kingsl...@opower.com]
Sent: Tuesday, August 05, 2014 9:06 AM
To: user@jclouds.apache.org
Subject: Re: How to write a Blob using an OutputStream?

org.apache.commons.io.output.ByteArrayOutputStream sounds like a nice 
improvement over java.io.ByteArrayOutputStream (at least for my purposes), 
thanks Zack!

The problem I'm running into is actually with the caller's 
Closeables.closeQuietly(documentOutputStream); call. That catches any 
IOException that's thrown from close() and logs it, instead of throwing it. 
That won't work for me, since I won't know if there was an error writing to the 
blob store until close() is called on my OutputStream. I can of course change 
the caller to use different error-handling for closing the stream, but it makes 
me wonder if using the close() method to upload the blob is the right approach. 
If you're given an OutputStream to write to, you'd expect the real errors to 
come from the write() methods, and not the close() method, right?
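
A hedged sketch of the difference on the caller's side, using only the JDK. UploadOnClose is a made-up stand-in for a stream that uploads in close(); the first method mimics what a quiet-close helper does, the second uses try-with-resources so the failure stays visible.

```java
import java.io.IOException;
import java.io.OutputStream;

final class CloseHandlingDemo {
    /** Accepts every write but fails on close(), like a stream that uploads on close. */
    static final class UploadOnClose extends OutputStream {
        @Override public void write(int b) {
            // buffered only; nothing can fail yet
        }

        @Override public void close() throws IOException {
            throw new IOException("upload failed");
        }
    }

    /** Swallows the close() failure, roughly what a quiet-close helper does. */
    static boolean quietCloseReportsFailure() {
        OutputStream out = new UploadOnClose();
        try {
            out.write(42);
        } catch (IOException e) {
            return true;
        } finally {
            try {
                out.close();
            } catch (IOException ignored) {
                // the upload failure is lost here
            }
        }
        return false;
    }

    /** try-with-resources lets the close() failure reach the catch block. */
    static boolean tryWithResourcesReportsFailure() {
        try (OutputStream out = new UploadOnClose()) {
            out.write(42);
        } catch (IOException e) {
            return true;
        }
        return false;
    }
}
```

With the quiet variant the caller never learns that the upload failed; with try-with-resources the IOException from close() is handled like any write error.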



Steve Kingsland

Senior Software Engineer

Opower http://www.opower.com/

We’re hiring! See jobs here: http://www.opower.com/careers


Re: How to write a Blob using an OutputStream?

2014-08-05 Thread Adrian Cole
jclouds currently doesn't have a direct path to the outputstream (or
channel), and even if it did, things mentioned by gaul would still be
true (ex. may need content-length up front).

jclouds doesn't have a direct path to becoming netty, so I wouldn't
get too excited about full-bore async. Chunking, multipart, etc. over
streams are very possible, though.
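
As a sketch of what chunking over a stream could look like (not how jclouds does multipart), the class below cuts written bytes into fixed-size parts and hands each one to a callback. PartSink and ChunkingOutputStream are hypothetical names for this example, and real stores impose their own part-size rules, which are ignored here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

/** Hypothetical receiver for one part of an upload (an assumption, not a jclouds interface). */
interface PartSink {
    void putPart(int partNumber, byte[] data) throws IOException;
}

/** Cuts the written bytes into fixed-size parts so only one part is held in memory. */
final class ChunkingOutputStream extends OutputStream {
    private final byte[] part;
    private final PartSink sink;
    private int filled;
    private int nextPartNumber = 1;
    private boolean closed;

    /** partSize is assumed to be positive. */
    ChunkingOutputStream(int partSize, PartSink sink) {
        this.part = new byte[partSize];
        this.sink = sink;
    }

    @Override public void write(int b) throws IOException {
        if (closed) {
            throw new IOException("stream already closed");
        }
        part[filled++] = (byte) b;
        if (filled == part.length) {
            emit(); // a full part is ready
        }
    }

    @Override public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        if (filled > 0) {
            emit(); // final, possibly short, part
        }
    }

    private void emit() throws IOException {
        sink.putPart(nextPartNumber++, Arrays.copyOf(part, filled));
        filled = 0;
    }
}
```

Unlike the buffer-everything approach, errors from full parts surface during write(), and only the last part's error is deferred to close().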

Personally, I'd recommend using something like okio buffer (or some
other buffer) and making that easier to work with (if it isn't
already). https://github.com/square/okio

Hope this helps,
-A

Re: How to write a Blob using an OutputStream?

2014-08-05 Thread Steve Kingsland
This wasn't terribly complicated to handle using a ByteArrayOutputStream,
once I fixed the callers to not closeQuietly()...

Here's the calling code, that has to return an OutputStream:

public OutputStream getOutputStream(String containerName, String resourceName)
        throws IOException {
    return new JcloudsObjectWritingByteArrayOutputStream(
            this.blobStoreContext.getBlobStore(), containerName, resourceName);
}

And here's what JcloudsObjectWritingByteArrayOutputStream looks like (it's
a bit long, so I put it in a gist):
https://gist.github.com/skingsland/d2341cd52cd36c6cbb6f

It's working ok with filesystem and in-memory object stores, but I'm
running into some (apparently-unrelated) errors with the particular object
store I'm trying to use (Ceph via S3 API). I'll save those for another
email...

I'd love to hear feedback on this approach. And thanks everyone for your
help!



*Steve Kingsland*

Senior Software Engineer

*Opower * http://www.opower.com/


*We’re hiring! See jobs here http://www.opower.com/careers *


Re: How to write a Blob using an OutputStream?

2014-08-04 Thread Andrew Gaul
On Mon, Aug 04, 2014 at 04:39:15PM -0400, Steve Kingsland wrote:
 I'm trying to use jclouds to write to an S3-compatible object store (Ceph),
 and I'd like to use an OutputStream to write the payload for a Blob. How do
 I do this?
 
 I'm working on an existing system which uses a stream-based abstraction
 around all of the file I/O, that looks like this:
 
 public interface ResourceFactory {
     InputStream getInputStream(String resourcePath) throws IOException;

     OutputStream getOutputStream(String resourcePath) throws IOException;
 }
 
 I was able to implement getInputStream() for *reading* a blob from jclouds,
 but I'm not sure how to return an OutputStream for *writing* a blob.
 
 I know this question has already been asked
 https://groups.google.com/forum/#!topic/jclouds/F2pCt9i7TSg, but it seems
 like a common-enough use case that it shouldn't be terribly complicated to
 implement. Can anyone provide suggestions for how to accomplish this?
 
 The best I could find is Payload#writeTo
 http://demobox.github.io/jclouds-maven-site-1.7.2/1.7.2/jclouds/apidocs/org/jclouds/io/WriteTo.html,
 which accepts an OutputStream but is @Deprecated. Thanks in advance!

Steve, I am not sure I understand your use case.  putBlob consumes an
input *source*, e.g., ByteSource or InputStream.  Why do you want to
provide it an output *sink*, e.g., OutputStream?  If you have a special
need, could you provide a custom implementation of ByteSource or
InputStream, or use PipedInputStream/PipedOutputStream if you really
must use an OutputStream?

-- 
Andrew Gaul
http://gaul.org/


Re: How to write a Blob using an OutputStream?

2014-08-04 Thread Steve Kingsland
My use case is:

1. the calling code is generating content in memory, and wants an
OutputStream to write it to (currently it's going to disk);

2. the putBlob() method wants a byte[], InputStream, etc. that it can read
from.

My problem is that *both* parties want to control the transaction. Here is
what my calling code looks like:

OutputStream documentOutputStream = null;
try {
   documentOutputStream = this.documentResourceFactory.getDocumentOutputStream(documentPath);

   renderAndWriteDocument(renderContext, documentOutputStream);
}
catch (IOException e) {
   ...
}
finally {
   Closeables.closeQuietly(documentOutputStream);
}

I'm trying to create an implementation of DocumentResourceFactory that
returns an OutputStream for writing the document to an Object Store using
jclouds, instead of writing it to the local file system. I guess a
stream-based API isn't really supported for writing to object stores...

In my case, the files are small enough that I'm OK buffering them in
memory. So what I'm planning to do, if there are no better options, is to
create an OutputStream implementation that buffers the file contents, and
uploads it to the blob store when flush()/close() is called. But that
doesn't sound great, so I'm hoping maybe someone else has a better idea?



*Steve Kingsland*

Senior Software Engineer

*Opower * http://www.opower.com/


*We’re hiring! See jobs here http://www.opower.com/careers *



Re: How to write a Blob using an OutputStream?

2014-08-04 Thread Andrew Gaul
Please look at PipedInputStream/PipedOutputStream which should address
this use case.


-- 
Andrew Gaul
http://gaul.org/


Re: How to write a Blob using an OutputStream?

2014-08-04 Thread Steve Kingsland
Thanks Andrew, I will. Can you provide any guidance, pseudo-code, examples,
etc. on how I would use a PipedOutputStream to buffer the content that's
being written, and upload it to a BlobStore?

To put it differently: how can I use these classes to return an
OutputStream that is capable of putting a blob in a blob store, all by
itself?



*Steve Kingsland*

Senior Software Engineer

* Opower * http://www.opower.com/


*We’re hiring! See jobs here http://www.opower.com/careers *


On Mon, Aug 4, 2014 at 8:30 PM, Andrew Gaul g...@apache.org wrote:

 Please look at PipedInputStream/PipedOutputStream which should address
 this use case.

 On Mon, Aug 04, 2014 at 08:10:49PM -0400, Steve Kingsland wrote:
  My use case is:
 
  1. the calling code is generating content in memory, and wants an
  OutputStream to write it to (currently it's going to disk);
 
  2. the putBlob() method wants a byte[], InputStream, etc. that it can
  read from.
 
  My problem is that *both* parties want to control the transaction. Here is
  what my calling code looks like:
 
  OutputStream documentOutputStream = null;
  try {
      documentOutputStream =
          this.documentResourceFactory.getDocumentOutputStream(documentPath);
 
      renderAndWriteDocument(renderContext, documentOutputStream);
  }
  catch (IOException e) {
      ...
  }
  finally {
      Closeables.closeQuietly(documentOutputStream);
  }
 
  I'm trying to create an implementation of DocumentResourceFactory that
  returns an OutputStream for writing the document to an Object Store using
  jclouds, instead of writing it to the local file system. I guess a
  stream-based API isn't really supported for writing to object stores...
 
  In my case, the files are small enough that I'm OK buffering them in
  memory. So what I'm planning to do, if there are no better options, is to
  create an OutputStream implementation that buffers the file contents, and
  uploads it to the blob store when flush()/close() is called. But that
  doesn't sound great, so I'm hoping maybe someone else has a better idea?
 
 
 
  *Steve Kingsland*
 
  Senior Software Engineer
 
  *Opower * http://www.opower.com/
 
 
  *We’re hiring! See jobs here http://www.opower.com/careers *
 
 
  On Mon, Aug 4, 2014 at 7:53 PM, Andrew Gaul g...@apache.org wrote:
 
   On Mon, Aug 04, 2014 at 04:39:15PM -0400, Steve Kingsland wrote:
    I'm trying to use jclouds to write to an S3-compatible object store
    (Ceph), and I'd like to use an OutputStream to write the payload for a
    Blob. How do I do this?
   
    I'm working on an existing system which uses a stream-based abstraction
    around all of the file I/O, that looks like this:
   
    public interface ResourceFactory {
        InputStream getInputStream(String resourcePath) throws IOException;
   
        OutputStream getOutputStream(String resourcePath) throws IOException;
    }
   
    I was able to implement getInputStream() for *reading* a blob from
    jclouds, but I'm not sure how to return an OutputStream for *writing* a
    blob.
   
    I know this question has already been asked
    https://groups.google.com/forum/#!topic/jclouds/F2pCt9i7TSg, but it seems
    like a common-enough use case that it shouldn't be terribly complicated
    to implement. Can anyone provide suggestions for how to accomplish this?
   
    The best I could find is Payload#writeTo
    http://demobox.github.io/jclouds-maven-site-1.7.2/1.7.2/jclouds/apidocs/org/jclouds/io/WriteTo.html,
    which accepts an OutputStream but is @Deprecated. Thanks in advance!
  
   Steve, I am not sure I understand your use case.  putBlob consumes an
   input *source*, e.g., ByteSource or InputStream.  Why do you want to
   provide it an output *sink*, e.g., OutputStream?  If you have a special
   need, could you provide a custom implementation of ByteSource or
   InputStream, or use PipedInputStream/PipedOutputStream if you really
   must use an OutputStream?
  
   --
   Andrew Gaul
   http://gaul.org/
  

 --
 Andrew Gaul
 http://gaul.org/



Re: How to write a Blob using an OutputStream?

2014-08-04 Thread Steve Kingsland
Here is Kevin's example using PipedInputStream and PipedOutputStream:
https://groups.google.com/d/msg/jclouds/F2pCt9i7TSg/AUF4AqOO0TMJ

I don't have the need to use different threads, though, so instead I'd do
something like this?

public OutputStream getOutputStream(String containerName, String resourceName)
        throws IOException {
    PipedInputStream in = new PipedInputStream();
    PipedOutputStream out = new PipedOutputStream(in);

    BlobStore blobStore = this.blobStoreContext.getBlobStore();
    Blob blob = blobStore.blobBuilder(resourceName).payload(in).build();

    blobStore.putBlob(containerName, blob);

    return out;
}

And then when close() or flush() is called on the returned OutputStream,
the blob is uploaded like magic? Is it OK that I'm not setting the content
length?






Re: How to write a Blob using an OutputStream?

2014-08-04 Thread Steve Kingsland
OK, then it appears that my calling code (which would be difficult and
risky to change) is incompatible with jclouds' BlobStore API: my caller
wants to obtain an OutputStream for writing to the blob store, and jclouds
wants to obtain an InputStream for reading the blob's content that should
be written. Therefore, my only solution is to buffer the blob data, either
in memory or on disk, before uploading it to the blob store.

Given that the documents I'm trying to write to the blob store will
generally be small (1KB to 1MB), I'm going with a simple approach, for
providing my caller with an OutputStream that they can use to write the
blob's payload:

import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.jclouds.blobstore.BlobStore;
import org.jclouds.blobstore.domain.Blob;

class BlobWritingByteArrayOutputStream extends ByteArrayOutputStream {

    // these are all set in the constructor
    private BlobStore blobStore;
    private String containerName, blobName;

    // the client will have to call this when he's finished writing, so this
    // is our chance to upload the blob, now that we have the full payload in
    // memory
    @Override
    public void close() throws IOException {
        super.close();

        writeBytesToBlob();
    }

    private void writeBytesToBlob() {
        byte[] payload = toByteArray();

        Blob blob = blobStore.blobBuilder(blobName)
                .payload(payload)
                .contentLength(payload.length) // arrays have a length field, not size
                .build();
        blobStore.putBlob(containerName, blob);
    }
}

Aside from the weird inversion of control going on and the requirement that
close() be called, I think something simple like this - to buffer the bytes
being written before uploading them to the blob store - might work for me.
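
One thing to watch with the buffer-and-upload-on-close idea is that close() may be called more than once, and a caller that closes quietly will never see a failed upload. The following is only a minimal sketch of a guard against the double upload, assuming Java 8; PayloadSink and PutOnCloseOutputStream are hypothetical names, and the sink is a stand-in for wherever the real blobStore.putBlob(...) call would go, not part of jclouds.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical callback standing in for the blobStore.putBlob(...) call. */
interface PayloadSink {
    void put(byte[] payload) throws IOException;
}

/** Buffers everything written to it and hands the bytes to the sink on the first close(). */
final class PutOnCloseOutputStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final PayloadSink sink;
    private boolean closed;

    PutOnCloseOutputStream(PayloadSink sink) {
        this.sink = sink;
    }

    @Override
    public void write(int b) throws IOException {
        ensureOpen();
        buffer.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        ensureOpen();
        buffer.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // a second close() must not upload the blob again
        }
        closed = true;
        sink.put(buffer.toByteArray());
    }

    private void ensureOpen() throws IOException {
        if (closed) {
            throw new IOException("stream already closed");
        }
    }
}
```

In real use the sink would build the Blob from the byte[] and set the content length from payload.length before calling putBlob.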

Thoughts?






On Mon, Aug 4, 2014 at 9:05 PM, Andrew Gaul g...@apache.org wrote:

 On Mon, Aug 04, 2014 at 08:46:37PM -0400, Steve Kingsland wrote:
  Here is Kevin's example using PipedInputStream and PipedOutputStream:
  https://groups.google.com/d/msg/jclouds/F2pCt9i7TSg/AUF4AqOO0TMJ
 
  I don't have the need to use different threads, though, so instead I'd do
  something like this?

 This will not work; putBlob blocks until the operation completes.
 Further you must use PipedInputStream/PipedOutputStream with separate
 threads to avoid deadlock, as its Javadoc states:

 http://docs.oracle.com/javase/7/docs/api/java/io/PipedInputStream.html

 Unfortunately jclouds has poor support for asynchronous operations, and
 you can really only fake the desired behavior with various InputStream
 implementations. I strongly recommend trying to cast your solution into
 some kind of ByteSource or InputStream.
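
For illustration, the separate-thread requirement for piped streams could look roughly like the sketch below. It assumes Java 8 and a caller-supplied ExecutorService; BlockingUploader and PipedUploadOutputStream are hypothetical names, and the uploader is a stand-in for a blocking call such as putBlob reading from the InputStream, not a jclouds API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a blocking upload that reads its payload from a stream. */
interface BlockingUploader {
    void upload(InputStream source) throws IOException;
}

/** OutputStream whose bytes are consumed by an uploader running on another thread. */
final class PipedUploadOutputStream extends OutputStream {
    private final PipedOutputStream pipeOut = new PipedOutputStream();
    private final Future<?> uploadTask;
    private boolean closed;

    PipedUploadOutputStream(BlockingUploader uploader, ExecutorService executor)
            throws IOException {
        PipedInputStream pipeIn = new PipedInputStream(pipeOut, 64 * 1024);
        // The reader runs on the executor's thread; reading and writing a pipe
        // from the same thread can deadlock once the pipe buffer fills up.
        this.uploadTask = executor.submit(() -> {
            try (InputStream in = pipeIn) {
                uploader.upload(in);
            }
            return null;
        });
    }

    @Override
    public void write(int b) throws IOException {
        pipeOut.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        pipeOut.write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        pipeOut.flush();
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        pipeOut.close(); // signals end-of-stream to the reading thread
        try {
            uploadTask.get(); // wait for the upload and surface any failure
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for upload", e);
        } catch (ExecutionException e) {
            throw new IOException("upload failed", e.getCause());
        }
    }
}
```

Note that this streams without knowing the total size up front, so it only helps with stores that accept a payload without a content length.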

  And then when close() or flush() is called on the returned OutputStream,
  the blob is uploaded like magic? Is it OK that I'm not setting the
  content length?

 Some blobstores, specifically Amazon S3, require a content length, while
 others such as OpenStack Swift do not.

 --
 Andrew Gaul
 http://gaul.org/