Re: How to write a Blob using an OutputStream?
Looks fine. I might switch to composition, but that's just a style nit. Ex.

    class PutOnCloseOutputStream extends FilterOutputStream {
      private final okio.Buffer buffer = new okio.Buffer();
      ...
      PutOnCloseOutputStream(…) {
        super(buffer.outputStream());
        …
      }

      @Override public void close() throws IOException {
        // put buffer.inputStream() with length buffer.size()
      }
    }
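For anyone who wants to try the composition idea without pulling in okio, here is a minimal sketch that uses only java.io. It is not the code from the gist and not a jclouds API: `BlobSink`, `PutOnCloseSketch` and `PutOnCloseDemo` are hypothetical names made up for illustration, and `BlobSink.put` merely stands in for building a Blob and calling putBlob. The buffer is passed through a private constructor so that it exists before `super(...)` runs.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the blob store upload; not a jclouds type. */
interface BlobSink {
    void put(byte[] payload) throws IOException;
}

/** Buffers all writes in memory and hands the bytes to the sink once, on close(). */
class PutOnCloseSketch extends FilterOutputStream {
    private final ByteArrayOutputStream buffer;
    private final BlobSink sink;
    private boolean closed;

    PutOnCloseSketch(BlobSink sink) {
        this(new ByteArrayOutputStream(), sink);
    }

    // The buffer is created by the public constructor so it can be passed to super().
    private PutOnCloseSketch(ByteArrayOutputStream buffer, BlobSink sink) {
        super(buffer);
        this.buffer = buffer;
        this.sink = sink;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Delegate in bulk; FilterOutputStream would otherwise write byte by byte.
        out.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // a second close() must not upload again
        }
        closed = true;
        super.close();
        // The full payload (and therefore its length) is known at this point.
        sink.put(buffer.toByteArray());
    }
}

class PutOnCloseDemo {
    /** Writes the text, closes twice, and returns every payload the sink received. */
    static List<String> run(String text) {
        List<String> uploads = new ArrayList<>();
        BlobSink sink = payload -> uploads.add(new String(payload, StandardCharsets.UTF_8));
        try {
            PutOnCloseSketch out = new PutOnCloseSketch(sink);
            out.write(text.getBytes(StandardCharsets.UTF_8));
            out.close();
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return uploads;
    }
}
```

Any IOException thrown by the sink surfaces from close(), so callers still have to let close() failures propagate.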
RE: How to write a Blob using an OutputStream?
Your code seems fine. I have used
http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/output/ByteArrayOutputStream.html
in the past to convert between stream types, but it seems like it doesn't match your case very well. Note that you might have to call writeBytesToBlob() before super.close(), but you can test that. Let us know how it turns out!

From: Steve Kingsland [steve.kingsl...@opower.com]
Sent: Monday, August 04, 2014 9:22 PM
To: user@jclouds.apache.org
Subject: Re: How to write a Blob using an OutputStream?

OK, then it appears that my calling code (which would be difficult and risky to change) is incompatible with jclouds' BlobStore API: my caller wants to obtain an OutputStream for writing to the blob store, and jclouds wants to obtain an InputStream for reading the blob's content that should be written. Therefore, my only solution is to buffer the blob data, either in memory or on disk, before uploading it to the blob store.

Given that the documents I'm trying to write to the blob store will generally be small (1KB to 1MB), I'm going with a simple approach for providing my caller with an OutputStream that they can use to write the blob's payload:

    class BlobWritingByteArrayOutputStream extends java.io.ByteArrayOutputStream {
        // these are all set in the constructor
        private BlobStore blobStore;
        private String containerName, blobName;

        // the client will have to call this when it has finished writing, so this is
        // our chance to upload the blob, now that we have the full payload in memory
        @Override
        public void close() throws IOException {
            super.close();
            writeBytesToBlob();
        }

        private void writeBytesToBlob() {
            byte[] payload = toByteArray();
            Blob blob = blobStore.blobBuilder(blobName)
                    .payload(payload)
                    .contentLength(payload.length)
                    .build();
            blobStore.putBlob(containerName, blob);
        }
    }

Aside from the weird inversion of control going on and the requirement that close() be called, I think something simple like this - to buffer the bytes being written before uploading them to the blob store - might work for me. Thoughts?

Steve Kingsland

On Mon, Aug 4, 2014 at 9:05 PM, Andrew Gaul g...@apache.org wrote:

On Mon, Aug 04, 2014 at 08:46:37PM -0400, Steve Kingsland wrote:
> Here is Kevin's example using PipedInputStream and PipedOutputStream:
> https://groups.google.com/d/msg/jclouds/F2pCt9i7TSg/AUF4AqOO0TMJ
>
> I don't have the need to use different threads, though, so instead I'd do
> something like this?

This will not work; putBlob blocks until the operation completes. Further, you must use PipedInputStream/PipedOutputStream with separate threads to avoid deadlock, as its Javadoc states:

http://docs.oracle.com/javase/7/docs/api/java/io/PipedInputStream.html

Unfortunately jclouds has poor support for asynchronous operations and you can really only fake the desired behavior with various InputStream. I strongly recommend trying to cast your solution into some kind of ByteSource or InputStream.

> And then when close() or flush() is called on the returned OutputStream, the
> blob is uploaded like magic? Is it OK that I'm not setting the content length?

Some blobstores, specifically Amazon S3, require a content length, while others such as OpenStack Swift do not.

--
Andrew Gaul
http://gaul.org/
RE: How to write a Blob using an OutputStream?
With buffered streams, for example, close() causes buffers to be flushed (which is technically what you are doing). So yes, you can get some serious exceptions when closing.

From: Steve Kingsland [steve.kingsl...@opower.com]
Sent: Tuesday, August 05, 2014 9:06 AM
To: user@jclouds.apache.org
Subject: Re: How to write a Blob using an OutputStream?

org.apache.commons.io.output.ByteArrayOutputStream sounds like a nice improvement over java.io.ByteArrayOutputStream (at least for my purposes), thanks Zack!

The problem I'm running into is actually with the caller's

    Closeables.closeQuietly(documentOutputStream);

call. That catches any IOException that's thrown from close() and logs it, instead of throwing it. That won't work for me, since I won't know if there was an error writing to the blob store until close() is called on my OutputStream.

I can of course change the caller to use different error-handling for closing the stream, but it makes me wonder if using the close() method to upload the blob is the right approach. If you're given an OutputStream to write to, you'd expect the real errors to come from the write() methods, and not the close() method, right?

Steve Kingsland
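To make the closeQuietly() problem concrete, here is a small sketch of the difference between swallowing and propagating a failure from close(). Everything in it is hypothetical: `FailingOnCloseStream` simulates an upload that fails on close, and the local `closeQuietly` is only an imitation of the helper discussed in the thread (it does not even log), not the library method itself.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Stream whose close() always fails, standing in for a failed upload. */
class FailingOnCloseStream extends OutputStream {
    @Override
    public void write(int b) {
        // Discard the data; only the close() behaviour matters here.
    }

    @Override
    public void close() throws IOException {
        throw new IOException("upload failed");
    }
}

class CloseHandlingDemo {
    /** Local imitation of a closeQuietly helper: the failure is swallowed. */
    static void closeQuietly(OutputStream out) {
        try {
            if (out != null) {
                out.close();
            }
        } catch (IOException ignored) {
            // The caller never learns that the upload failed.
        }
    }

    /** Returns true if the calling code got to see the failure. */
    static boolean failureSeenWithCloseQuietly() {
        OutputStream out = new FailingOnCloseStream();
        try {
            out.write(1);
        } catch (IOException e) {
            return true;
        } finally {
            closeQuietly(out);
        }
        return false;
    }

    /** Returns true if the calling code got to see the failure. */
    static boolean failureSeenWithTryWithResources() {
        try (OutputStream out = new FailingOnCloseStream()) {
            out.write(1);
        } catch (IOException e) {
            // Exceptions thrown by close() arrive here as well.
            return true;
        }
        return false;
    }
}
```

With try-with-resources (Java 7 and later) the exception from close() reaches the catch block, which is what an upload-on-close stream needs from its callers.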
Re: How to write a Blob using an OutputStream?
jclouds currently doesn't have a direct path to the OutputStream (or channel), and even if it did, the things gaul mentioned would still be true (e.g. it may need the content length up front). jclouds doesn't have a direct path to becoming netty, so I wouldn't get too excited about full-bore async. Chunking, multipart, etc. over streams are very possible, though.

Personally, I'd recommend using something like the okio Buffer (or some other buffer) and making that easier to work with (if it isn't already).

https://github.com/square/okio

Hope this helps,
-A
Re: How to write a Blob using an OutputStream?
This wasn't terribly complicated to handle using a ByteArrayOutputStream, once I fixed the callers to not closeQuietly()...

Here's the calling code, which has to return an OutputStream:

    public OutputStream getOutputStream(String containerName, String resourceName) throws IOException {
        return new JcloudsObjectWritingByteArrayOutputStream(
                this.blobStoreContext.getBlobStore(), containerName, resourceName);
    }

And here's what JcloudsObjectWritingByteArrayOutputStream looks like (it's a bit long, so I put it in a gist):

https://gist.github.com/skingsland/d2341cd52cd36c6cbb6f

It's working OK with filesystem and in-memory object stores, but I'm running into some (apparently unrelated) errors with the particular object store I'm trying to use (Ceph via S3 API). I'll save those for another email...

I'd love to hear feedback on this approach. And thanks everyone for your help!

Steve Kingsland
Senior Software Engineer
Opower
http://www.opower.com/
We’re hiring! See jobs here: http://www.opower.com/careers
Re: How to write a Blob using an OutputStream?
On Mon, Aug 04, 2014 at 04:39:15PM -0400, Steve Kingsland wrote:
> I'm trying to use jclouds to write to an S3-compatible object store (Ceph),
> and I'd like to use an OutputStream to write the payload for a Blob. How do I
> do this?
>
> I'm working on an existing system which uses a stream-based abstraction
> around all of the file I/O, that looks like this:
>
>     public interface ResourceFactory {
>         InputStream getInputStream(String resourcePath) throws IOException;
>         OutputStream getOutputStream(String resourcePath) throws IOException;
>     }
>
> I was able to implement getInputStream() for *reading* a blob from jclouds,
> but I'm not sure how to return an OutputStream for *writing* a blob. I know
> this question has already been asked
> (https://groups.google.com/forum/#!topic/jclouds/F2pCt9i7TSg), but it seems
> like a common-enough use case that it shouldn't be terribly complicated to
> implement.
>
> Can anyone provide suggestions for how to accomplish this? The best I could
> find is Payload#writeTo
> (http://demobox.github.io/jclouds-maven-site-1.7.2/1.7.2/jclouds/apidocs/org/jclouds/io/WriteTo.html),
> which accepts an OutputStream but is @Deprecated.
>
> Thanks in advance!

Steve, I am not sure I understand your use case. putBlob consumes an input *source*, e.g., ByteSource or InputStream. Why do you want to provide it an output *sink*, e.g., OutputStream? If you have a special need, could you provide a custom implementation of ByteSource or InputStream, or use PipedInputStream/PipedOutputStream if you really must use an OutputStream?

--
Andrew Gaul
http://gaul.org/
Re: How to write a Blob using an OutputStream?
My use case is:

1. the calling code is generating content in memory, and wants an OutputStream to write it to (currently it's going to disk);
2. the putBlob() method wants a byte[], InputStream, etc. that it can read from.

My problem is that *both* parties want to control the transaction. Here is what my calling code looks like:

    OutputStream documentOutputStream = null;
    try {
        documentOutputStream = this.documentResourceFactory.getDocumentOutputStream(documentPath);
        renderAndWriteDocument(renderContext, documentOutputStream);
    } catch (IOException e) {
        ...
    } finally {
        Closeables.closeQuietly(documentOutputStream);
    }

I'm trying to create an implementation of DocumentResourceFactory that returns an OutputStream for writing the document to an object store using jclouds, instead of writing it to the local file system.

I guess a stream-based API isn't really supported for writing to object stores... In my case, the files are small enough that I'm OK buffering them in memory. So what I'm planning to do, if there are no better options, is to create an OutputStream implementation that buffers the file contents and uploads them to the blob store when flush()/close() is called. But that doesn't sound great, so I'm hoping maybe someone else has a better idea?

Steve Kingsland
Re: How to write a Blob using an OutputStream?
Please look at PipedInputStream/PipedOutputStream, which should address this use case.

--
Andrew Gaul
http://gaul.org/
Re: How to write a Blob using an OutputStream?
Thanks Andrew, I will. Can you provide any guidance, pseudo-code, examples, etc. on how I would use a PipedOutputStream to buffer the content that's being written, and upload it to a BlobStore?

To put it differently: how can I use these classes to return an OutputStream that is capable of putting a blob in a blob store, all by itself?

Steve Kingsland
Re: How to write a Blob using an OutputStream?
Here is Kevin's example using PipedInputStream and PipedOutputStream:
https://groups.google.com/d/msg/jclouds/F2pCt9i7TSg/AUF4AqOO0TMJ

I don't have the need to use different threads, though, so instead I'd do something like this?

    public OutputStream getOutputStream(String containerName, String resourceName) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);

        BlobStore blobStore = this.blobStoreContext.getBlobStore();
        Blob blob = blobStore.blobBuilder(resourceName).payload(in).build();
        blobStore.putBlob(containerName, blob);

        return out;
    }

And then when close() or flush() is called on the returned OutputStream, the blob is uploaded like magic? Is it OK that I'm not setting the content length?

Steve Kingsland

On Mon, Aug 4, 2014 at 8:38 PM, Steve Kingsland steve.kingsl...@opower.com wrote:
> Thanks Andrew, I will. Can you provide any guidance, pseudo-code, examples, etc. on how I would use a PipedOutputStream to buffer the content that's being written, and upload it to a BlobStore?
>
> To put it differently: how can I use these classes to return an OutputStream that is capable of putting a blob in a blob store, all by itself?
>
> On Mon, Aug 4, 2014 at 8:30 PM, Andrew Gaul g...@apache.org wrote:
>> Please look at PipedInputStream/PipedOutputStream, which should address this use case.
>>
>> On Mon, Aug 04, 2014 at 08:10:49PM -0400, Steve Kingsland wrote:
>>> My use case is:
>>>
>>> 1. the calling code is generating content in memory, and wants an OutputStream to write it to (currently it's going to disk);
>>> 2. the putBlob() method wants a byte[], InputStream, etc. that it can read from.
>>>
>>> My problem is that *both* parties want to control the transaction.
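For contrast with the single-threaded snippet proposed in this message: the java.io.PipedInputStream Javadoc warns that using both ends of a pipe from one thread may deadlock, so the reading side normally runs on its own thread. Below is a minimal JDK-only sketch of that arrangement. It is not jclouds code: `PipedUploadSketch`, `BlockingUploader`, and `open` are hypothetical names, and `BlockingUploader` merely stands in for a blocking call such as putBlob().

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PipedUploadSketch {

    /** Hypothetical stand-in for a blocking upload such as BlobStore.putBlob(). */
    interface BlockingUploader {
        void upload(InputStream in) throws IOException;
    }

    /** Returns a stream to write to; the uploader drains the pipe on a worker thread. */
    static OutputStream open(BlockingUploader uploader, ExecutorService executor) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream pipeOut = new PipedOutputStream(in);

        // Reader side: runs on a separate thread so the writer is never its own consumer.
        Future<Void> pending = executor.submit(() -> {
            try (InputStream source = in) {
                uploader.upload(source);
            }
            return null;
        });

        return new FilterOutputStream(pipeOut) {
            @Override
            public void write(byte[] b, int off, int len) throws IOException {
                out.write(b, off, len); // avoid FilterOutputStream's byte-at-a-time default
            }

            @Override
            public void close() throws IOException {
                super.close(); // closing the pipe signals end of stream to the reader
                try {
                    pending.get(); // wait for the upload and surface its failure, if any
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while waiting for upload", e);
                } catch (ExecutionException e) {
                    throw new IOException("upload failed", e.getCause());
                }
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        // Toy uploader: copies the stream into memory instead of calling a blob store.
        BlockingUploader uploader = in -> {
            byte[] chunk = new byte[512];
            for (int n; (n = in.read(chunk)) != -1; ) {
                received.write(chunk, 0, n);
            }
        };
        try (OutputStream out = open(uploader, executor)) {
            out.write("hello blob".getBytes(StandardCharsets.UTF_8));
        } finally {
            executor.shutdown();
        }
        System.out.println(received.size() + " bytes uploaded");
    }
}
```

Note that the upload only finishes inside close(), so a caller that closes quietly would never see a failed upload. The sketch also does nothing about content length, which is unknown while the data is still streaming.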
Re: How to write a Blob using an OutputStream?
OK, then it appears that my calling code (which would be difficult and risky to change) is incompatible with jclouds' BlobStore API: my caller wants to obtain an OutputStream for writing to the blob store, and jclouds wants to obtain an InputStream for reading the blob's content that should be written. Therefore, my only solution is to buffer the blob data, either in memory or on disk, before uploading it to the blob store.

Given that the documents I'm trying to write to the blob store will generally be small (1KB to 1MB), I'm going with a simple approach for providing my caller with an OutputStream that they can use to write the blob's payload:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import org.jclouds.blobstore.BlobStore;
    import org.jclouds.blobstore.domain.Blob;

    class BlobWritingByteArrayOutputStream extends ByteArrayOutputStream {
        private final BlobStore blobStore;
        private final String containerName;
        private final String blobName;

        BlobWritingByteArrayOutputStream(BlobStore blobStore, String containerName, String blobName) {
            this.blobStore = blobStore;
            this.containerName = containerName;
            this.blobName = blobName;
        }

        // the client will have to call this when finished writing, so this is our chance
        // to upload the blob, now that we have the full payload in memory
        @Override
        public void close() throws IOException {
            super.close();
            writeBytesToBlob();
        }

        private void writeBytesToBlob() {
            byte[] payload = toByteArray();
            Blob blob = blobStore.blobBuilder(blobName)
                    .payload(payload)
                    .contentLength(payload.length) // arrays expose length, not size
                    .build();
            blobStore.putBlob(containerName, blob);
        }
    }

Aside from the weird inversion of control going on and the requirement that close() be called, I think something simple like this - to buffer the bytes being written before uploading them to the blob store - might work for me. Thoughts?

Steve Kingsland

On Mon, Aug 4, 2014 at 9:05 PM, Andrew Gaul g...@apache.org wrote:
> On Mon, Aug 04, 2014 at 08:46:37PM -0400, Steve Kingsland wrote:
>> Here is Kevin's example using PipedInputStream and PipedOutputStream:
>> https://groups.google.com/d/msg/jclouds/F2pCt9i7TSg/AUF4AqOO0TMJ
>>
>> I don't have the need to use different threads, though, so instead I'd do something like this?
>
> This will not work; putBlob blocks until the operation completes. Further, you must use PipedInputStream/PipedOutputStream with separate threads to avoid deadlock, as its Javadoc states:
> http://docs.oracle.com/javase/7/docs/api/java/io/PipedInputStream.html
>
> Unfortunately jclouds has poor support for asynchronous operations and you can really only fake the desired behavior with various InputStream implementations. I strongly recommend trying to cast your solution into some kind of ByteSource or InputStream.
>
>> And then when close() or flush() is called on the returned OutputStream, the blob is uploaded like magic? Is it OK that I'm not setting the content length?
>
> Some blobstores, specifically Amazon S3, require a content length, while others such as OpenStack Swift do not.
>
> -- Andrew Gaul http://gaul.org/
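One gap in the buffering class in this message is that a second close() would upload the blob a second time. A possible variant is sketched below, using only the JDK: it wraps a ByteArrayOutputStream rather than extending it, and uploads at most once. `UploadOnCloseSketch`, `UploadOnCloseOutputStream`, and `BlobUploader` are hypothetical names; `BlobUploader` stands in for building the Blob and calling putBlob(), and is an assumption rather than a jclouds type.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class UploadOnCloseSketch {

    /** Hypothetical stand-in for building a Blob and calling putBlob(). */
    interface BlobUploader {
        void upload(byte[] payload, long contentLength) throws IOException;
    }

    /** Buffers every write in memory and hands the bytes to the uploader once, on close(). */
    static final class UploadOnCloseOutputStream extends FilterOutputStream {
        private final ByteArrayOutputStream buffer;
        private final BlobUploader uploader;
        private boolean closed;

        UploadOnCloseOutputStream(BlobUploader uploader) {
            this(new ByteArrayOutputStream(), uploader);
        }

        private UploadOnCloseOutputStream(ByteArrayOutputStream buffer, BlobUploader uploader) {
            super(buffer); // all writes are delegated to the in-memory buffer
            this.buffer = buffer;
            this.uploader = uploader;
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len); // avoid FilterOutputStream's byte-at-a-time default
        }

        @Override
        public void close() throws IOException {
            if (closed) {
                return; // a repeated close() must not upload again
            }
            closed = true;
            byte[] payload = buffer.toByteArray();
            // The full payload is in memory, so its length is known up front.
            uploader.upload(payload, payload.length);
        }
    }

    public static void main(String[] args) throws IOException {
        BlobUploader printer = (payload, length) ->
                System.out.println("uploading " + length + " bytes");
        try (UploadOnCloseOutputStream out = new UploadOnCloseOutputStream(printer)) {
            out.write("hello blob".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

Because the payload is fully buffered before the upload starts, the content length is always available for stores that require it. The upload still happens inside close(), so callers should let the IOException from close() propagate rather than swallow it.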