A document has been updated:
http://cocoon.zones.apache.org/daisy/documentation/681.html
Document ID: 681
Branch: main
Language: default
Name: Creating a Reader (unchanged)
Document Type: Document (unchanged)
Updated on: 8/25/05 8:51:34 PM
Updated by: Berin Loritsch
A new version has been created, state: publish
Parts
=====
Content
-------
This part has been updated.
Mime type: text/xml (unchanged)
File name: (unchanged)
Size: 18304 bytes (previous version: 9063 bytes)
Content diff:
(85 equal lines skipped)
import java.io.Serializable;
import java.sql.Connection;
import java.sql.ResultSet;
+++ import java.sql.SQLException;
import java.sql.Statement;
import java.util.Map;
(34 equal lines skipped)
</pre>
<p>Now we are going to override the <tt>service()</tt> method and implement
the
--- <tt>dispose()</tt> method to get and cleanup after ourselves. First lets
start
--- with getting the DataSourceComponent. Because Cocoon is configured to deal
with
+++ <tt>dispose()</tt> method to get our components and clean up after ourselves. First let's
start
+++ with getting the DataSourceComponent. Because Cocoon is configured to deal
with
multiple databases, you will need to use a ServiceSelector to choose the
DataSourceComponent corresponding to your desired database.</p>
(8 equal lines skipped)
</pre>
<p class="note">The <tt>@Override</tt> annotation above is used by the Java
--- compiler to ensure that you are overriding a parent class's method. It only
--- works in Java 5. If you are developing against an earlier version of Java
--- remove that line so that you can compile the class. That goes for every
time
--- you see it.</p>
+++ compiler to ensure that you are overriding a parent class's method. It only
+++ works in Java 5 and later. If you are developing against an earlier version of Java,
remove
+++ that line so that you can compile the class. That goes for every time you
see
+++ it.</p>
<p>We ensured that we called the superclass's <tt>service()</tt> method so
that
--- we didn't upset the expectations of anyone wanting to extend our class.
Keeping
+++ we didn't upset the expectations of anyone wanting to extend our class.
Keeping
the user's expectations in mind always helps to produce a good product--and
in
--- this case the user is a developer. Next, we retrieved the selector for the
--- DataSourceComponent and stored it in the class field we created earlier.
Then
--- we did the same for the actual DataSouceComponent itself. Now we have
access to
--- the component when we need it. We didn't get an actual connection yet
because
--- the connections are pooled. If we held onto a connection for the life of
the
+++ this case the user is a developer. Next, we retrieved the selector for the
+++ DataSourceComponent and stored it in the class field we created earlier.
Then we
+++ did the same for the actual DataSourceComponent itself. Now we have access
to the
+++ component when we need it. We didn't get an actual connection yet because
the
+++ connections are pooled. If we held onto a connection for the life of the
component, then the pool would run out and the application would come to a
screeching
halt waiting for a connection to become available.</p>
<p>Since we are still dealing with managing the component itself, let's do
the
--- cleanup code next. The Avalon framework uses the
<tt>Disposable.dispose()</tt>
+++ cleanup code next. The Avalon framework uses the
<tt>Disposable.dispose()</tt>
callback method to let the component know when it is safe to release all the
components it is using and perform other cleanup.</p>
(7 equal lines skipped)
</pre>
<p>While setting the fields to <tt>null</tt> might not be necessary with
modern
--- day garbage collectors, it still doesn't hurt. By releasing those
components we
+++ day garbage collectors, it still doesn't hurt. By releasing those
components we
ensure that Cocoon can shut down nicely and safely when it is time.</p>
<h3>Make sure PDFs Work</h3>
<p>Since we expect to have PDF documents in our database alongside pictures
and
--- other types of documents, we need to make sure they display properly.
Since the
+++ other types of documents, we need to make sure they display properly. Since
the
bug in the IE Acrobat Reader plugin wasn't fixed until version 7 we need to
make
--- sure the content length is returned. There is some overhead with this as
Cocoon
+++ sure the content length is returned. There is some overhead with this as
Cocoon
has to cache the results to get the content length, but because we are
going to
--- cache it anyway there is little difference on when it gets sent to the
cache.
+++ cache it anyway there is little difference on when it gets sent to the
cache.
This is how we do it:</p>
<pre> @Override
(5 equal lines skipped)
<h3>Setting up for the Read (Cache directives, finding the resource,
etc.)</h3>
+++ <p>In the <tt>setup()</tt> method we need to ask the database for the
+++ meta-information about our attachment. You may be curious why we need to
do it
+++ in the setup as opposed to the generate phase of the Reader. The answer is
+++ simply this: by the time <tt>generate()</tt> runs, the sitemap has already asked the Reader for all
+++ caching-related information, so it is too late to look it up then. We'll assume the attachments
+++ table is really simple and it has an ID, a mimeType, a timeStamp, and the
+++ attachment content. We need to get our component and query it. You can
never
+++ rely on your connection pooling code to clean up your open statements and
+++ resultsets, so we will have to do that ourselves. Let's add some more class
+++ fields to support the cache directives and cache the blob reference:</p>
+++
+++ <pre> private TimeStampValidity m_validity;
+++ private InputStream m_content;
+++ private String m_mimeType;
+++ </pre>
+++
+++ <p>Since our AttachmentReader is pooled and recyclable, let's make sure we
+++ clean these values up when the AttachmentReader is returned to the pool:</p>
+++
+++ <pre> @Override
+++ public void recycle()
+++ {
+++ super.recycle();
+++ if ( null != m_content ) try{ m_content.close(); } catch(Exception e) {/*ignore*/}
+++ m_content = null;
+++ m_validity = null;
+++ m_mimeType = null;
+++ }
+++ </pre>
+++
+++ <p>The next code snippet is the content of the setup() method from the code
+++ skeleton above. Let's break it down to understand what's going on. First
we
+++ call the superclass's version of the method so that all expectations of the
+++ class hold true:</p>
+++
+++ <pre> super.setup(sourceResolver, objectModel, src, params);
+++ </pre>
+++
+++ <p>Next we set up the holders for the connection, statement and resultset so
+++ that we can clean them up later.</p>
+++
+++ <pre> Connection con = null;
+++ ResultSet rs = null;
+++ Statement stm = null;
+++ </pre>
+++
+++ <p>Now we have the meat of the method. We get a connection from the
+++ DataSourceComponent, and for good measure we set the AutoCommit to false.
You
+++ can adjust this to your taste, but for a read we really don't need
+++ transactions. There is some standard query code next, and the part I want
to
+++ point out is how we deal with the resultset. Notice that we have two
courses
+++ of action depending on whether the record was found or not. If we did find
the
+++ record we set the mimeType, validity, and content fields for the class.
+++ Otherwise, we throw <tt>ResourceNotFoundException</tt>. That exception is
how
+++ Cocoon knows to differentiate between a 404 (HTTP Resource Not Found) and a
500
+++ (HTTP Server Error) error.</p>
+++
+++ <pre> try
+++ {
+++ final String sql = "SELECT mimeType, sourceDate, attachmentData FROM attachments" +
+++ " WHERE attachmentId = '" + source + "'";
+++
+++ con = datasource.getConnection();
+++ con.setAutoCommit(false);
+++ stm = con.createStatement();
+++ rs = stm.executeQuery(sql);
+++
+++ if (rs.next())
+++ {
+++ m_mimeType = rs.getString(1);
+++ m_validity = new TimeStampValidity( rs.getTimestamp(2).getTime() );
+++ m_content = rs.getBlob(3).getBinaryStream();
+++ }
+++ else
+++ {
+++ throw new ResourceNotFoundException("Could not find the record");
+++ }
+++ }
+++ </pre>
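+++
+++ <p>Because the query above concatenates <tt>source</tt> into the SQL text, a request value containing a quote can change the statement. A minimal sketch of one alternative, assuming the same table and column names, binds the ID through a <tt>PreparedStatement</tt> instead. The <tt>AttachmentQuery</tt> class and its <tt>prepare()</tt> method are hypothetical names used only for illustration; they are not part of Cocoon or of the skeleton above:</p>

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical helper for illustration; not part of Cocoon. */
final class AttachmentQuery
{
    /** Same table and columns as the tutorial's query, with a placeholder for the ID. */
    static final String SQL =
        "SELECT mimeType, sourceDate, attachmentData FROM attachments"
        + " WHERE attachmentId = ?";

    private AttachmentQuery() {}

    /**
     * Prepares the lookup and binds the ID as a parameter, so the driver
     * treats it as data rather than as part of the SQL text.
     * The caller is responsible for closing the returned statement.
     */
    static PreparedStatement prepare(Connection con, String attachmentId)
        throws SQLException
    {
        PreparedStatement stm = con.prepareStatement(SQL);
        boolean bound = false;
        try
        {
            stm.setString(1, attachmentId);
            bound = true;
            return stm;
        }
        finally
        {
            // Do not leak the statement if binding the parameter failed.
            if (!bound)
            {
                try { stm.close(); } catch (SQLException ignored) { /* keep the original exception */ }
            }
        }
    }
}
```

+++ <p>Inside <tt>setup()</tt> this would presumably look like <tt>stm = AttachmentQuery.prepare(con, source);</tt> followed by <tt>rs = stm.executeQuery();</tt>, with the <tt>stm</tt> holder declared as a <tt>PreparedStatement</tt>. The cleanup code stays the same.</p>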
+++
+++ <p>If for some reason we catch a <tt>SQLException</tt> from the database,
it is
+++ certainly not expected so we rethrow it wrapped with a general
+++ <tt>ProcessingException</tt>.</p>
+++
+++ <pre> catch (SQLException se)
+++ {
+++ throw new ProcessingException(se);
+++ }
+++ </pre>
+++
+++ <p>Lastly we clean up our database objects in the finally block. Without
that
+++ we run into database server memory leaks as the database keeps resources
open
+++ for queries on the server side. Even the big name databases are sensitive
to
+++ this. The JDBCDataSourceComponent connection pooling code does cache the
+++ resultsets and statements to make sure they are closed when you close the
+++ connection, but you might want to use a generic J2EEDataSourceComponent
which
+++ may or may not do that for you. Never make assumptions and always clean up
+++ after yourself.</p>
+++
+++ <pre> finally
+++ {
+++ if (rs != null) try{ rs.close(); } catch(SQLException se) {/*ignore*/}
+++ if (stm != null) try{ stm.close(); } catch(SQLException se) {/*ignore*/}
+++ if (con != null) try{ con.close(); } catch(SQLException se) {/*ignore*/}
+++ }
+++ </pre>
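+++
+++ <p>The three nearly identical lines in the <tt>finally</tt> block could be folded into a small helper. The following is only a sketch: <tt>Cleanup.closeQuietly()</tt> is a hypothetical name, and it assumes Java 7 or later, where <tt>ResultSet</tt>, <tt>Statement</tt> and <tt>Connection</tt> all implement <tt>AutoCloseable</tt>:</p>

```java
/** Hypothetical cleanup helper for illustration; assumes Java 7 or later. */
final class Cleanup
{
    private Cleanup() {}

    /**
     * Closes the resource if it is non-null and swallows any exception,
     * mirroring the "ignore" handling in the tutorial's finally block.
     *
     * @return true if close() was called and completed normally
     */
    static boolean closeQuietly(AutoCloseable resource)
    {
        if (resource == null)
        {
            return false; // nothing was opened, nothing to close
        }
        try
        {
            resource.close();
            return true;
        }
        catch (Exception e)
        {
            // A failure while closing should not mask the result of the query.
            return false;
        }
    }
}
```

+++ <p>With such a helper the <tt>finally</tt> block would call <tt>closeQuietly(rs)</tt>, <tt>closeQuietly(stm)</tt> and <tt>closeQuietly(con)</tt>, in that order, so the resultset and statement are released before the connection goes back to the pool.</p>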
+++
+++ <p>The setup is done. Now we just need to let the sitemap know what we
found.
+++ The first thing is to let the sitemap know what kind of attachment we are
+++ sending. As you recall, we stored that in the class field "m_mimeType",
and the
+++ <tt>getMimeType()</tt> method from SitemapOutputComponent informs the
sitemap.
+++ </p>
+++
+++ <pre> @Override
+++ public String getMimeType()
+++ {
+++ return m_mimeType;
+++ }
+++ </pre>
+++
+++ <p>Now we want to let the sitemap know the last modified timestamp for the
+++ attachment. Since we stored this information in the "m_validity" field we
will
+++ send the information from that field. There is a problem though: what if
the
+++ resource was not found? We might get a NullPointerException if the
m_validity
+++ field was never set. Even though the Sitemap shouldn't call this method in
the
+++ event that we couldn't find a resource we still don't want to take any
chances.
+++ A properly guarded <tt>getLastModified()</tt> method would be:</p>
+++
+++ <pre> @Override
+++ public long getLastModified()
+++ {
+++ return (null == m_validity) ? -1L : m_validity.getTimeStamp();
+++ }
+++ </pre>
+++
+++ <h3>The Caching Clues</h3>
+++
+++ <p>Lastly we want to provide the caching information to the CachingPipeline
when
+++ needed. Since our source is an ID (from <tt>&lt;map:read
src="{1}"/&gt;</tt>)
+++ it is probably the best cache key for our component. Let's just use it:</p>
+++
+++ <pre> public Serializable getKey()
+++ {
+++ return source;
+++ }
+++ </pre>
+++
+++ <p>We stored the TimeStampValidity object when we set up the attachment
+++ information, so let's just give that back. Alternatively you could use an
+++ ExpiresValidity to avoid hits to the database altogether--but
for now
+++ this is good enough.</p>
+++
+++ <pre> public SourceValidity getValidity()
+++ {
+++ return m_validity;
+++ }
+++ </pre>
+++
+++ <h3>Sending the Payload</h3>
+++
+++ <p>All this work was done just so we could send the results back to the
client,
+++ and now we get to see the code that does it.</p>
+++
+++ <p class="warn">Don't try to read the entire attachment into memory and then
+++ send it on to the user. It isn't necessary and it kills your scalability.
+++ Instead grab little chunks at a time and send them on to the output stream as
you
+++ get them. You'll find that it feels faster on the client end as well.</p>
+++
+++ <p>The next code snippet is the contents of the <tt>generate()</tt> method
from
+++ the class skeleton above. All we are doing is pulling a little data at a
time
+++ from the database and sending it directly to the user. Wait a minute! I
hear
+++ you shout. What about the connection we just closed in the setup method?
+++ Remember that the connection isn't closed until the pool retires it. You
will
+++ never practically need to worry about the system severing your connection
to the
+++ database mid-stream. Try it. Throw a load test at the system just to make
sure
+++ I'm not smoking some controlled substances. Nevertheless, without much
further
+++ ado, the code:</p>
+++
+++ <pre> public void generate() throws IOException, SAXException, ProcessingException
+++ {
+++ try
+++ {
+++ byte[] buffer = new byte[BUFFER];
+++ int len = 0;
+++
+++ while ((len = m_content.read(buffer)) >= 0)
+++ {
+++ out.write(buffer, 0, len);
+++ }
+++
+++ out.flush();
+++ }
+++ finally
+++ {
+++ out.close();
+++ m_content.close();
+++ m_content = null;
+++ }
+++ }
+++ </pre>
+++
+++ <p>We close the stream in the finally clause. If there are any exceptions
+++ thrown, they are propagated up without rewrapping them. You may wonder why
we
+++ close the <tt>m_content</tt> stream here and in the <tt>recycle()</tt>
method
+++ above. The answer is assurance. The <tt>generate()</tt> method is only
called
+++ when the resource exists and actually has to be sent, so closing the stream
+++ only there could leave it open; <tt>recycle()</tt> catches those cases.
Additionally,
+++ most database drivers tend to wait on all open streams to be closed manually
+++ before the connection with the server is severed. Of course there are
timeout
+++ limits as well, but we don't want to use them if we can avoid it. By
including
+++ the call to close the attachment data stream in the <tt>generate()</tt>
method,
+++ we shorten the amount of time that there might be resources tied up with the
+++ stream.</p>
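+++
+++ <p>The copy loop in <tt>generate()</tt> does not depend on Cocoon, so it can be tried out on its own. The sketch below assumes a hypothetical <tt>StreamCopy</tt> class and a buffer size of 8192 bytes; the value of the tutorial's <tt>BUFFER</tt> constant is not shown in this part of the document:</p>

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical stand-alone version of the chunked copy loop. */
final class StreamCopy
{
    /** Assumed chunk size; the tutorial's BUFFER constant may differ. */
    static final int BUFFER = 8192;

    private StreamCopy() {}

    /**
     * Copies the input to the output one chunk at a time, so memory use is
     * bounded by BUFFER no matter how large the attachment is.
     * Neither stream is closed here; that stays with the caller.
     *
     * @return the number of bytes copied
     */
    static long copy(InputStream in, OutputStream out) throws IOException
    {
        byte[] buffer = new byte[BUFFER];
        long total = 0;
        int len;

        // read() returns -1 at end of stream; a return of 0 just loops again.
        while ((len = in.read(buffer)) >= 0)
        {
            out.write(buffer, 0, len);
            total += len;
        }

        out.flush();
        return total;
    }
}
```

+++ <p>Feeding it a <tt>ByteArrayInputStream</tt> and a <tt>ByteArrayOutputStream</tt> is a quick way to confirm that inputs larger than one buffer arrive intact.</p>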
+++
+++ <h2>Summary</h2>
+++
+++ <p>We're done. It seems like we did a lot here, and that's because we
did. If
+++ we simply did direct generation of the data the class would have been
simpler.
+++ By incorporating a database into the mix we've covered most of the things
you
+++ might be curious about: how to access other components from your
+++ component, how to make sure your component is cacheable, and some real
gotchas
+++ that you want to avoid. The example we have here should perform well,
+++ and is not too different from Cocoon's DatabaseReader. Of course, by doing
it
+++ ourselves we get to learn a bit more about how things work inside of
Cocoon.</p>
+++
</body>
</html>
Fields
======
no changes
Links
=====
no changes
Custom Fields
=============
no changes
Collections
===========
no changes