
There are a few methods of picking a 3D object from a 2D coordinate.
The ones I know of are (in order of popularity):
1)  gluUnProject the touch point at the near and far planes to get a
ray, then use a collision detection algorithm to check it against
your object bounds
2)  Use the color/depth buffer
--  This takes a little extra work.
--  When a touch comes in, on the next draw, you render the pickable
objects without textures, lights, etc, each with a unique flat color.
Keep your Z-buffer on so that the nearest object wins at each pixel.
You then read back the pixel at the touch point (glReadPixels) and
its color tells you which object was hit.  Clear the buffers and do
your normal rendering after.
3)  You can implement something like gluUnProject yourself if you
keep your look-at (view) transform in a separate matrix.  It's not
as hard as it sounds.  Start with the vector 0,0,1, offset its x and
y based on the touch position and field of view, then transform it by
the inverse of that matrix to get the ray in world space.  I'd rather
just use unproject, though.

As far as loading models and 3D data...
There are a lot of different file
formats, each with its own tradeoffs.  If you want static geometry,
you can just write an OBJ loader.  It's not very hard and I actually
posted code to my first one:
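
(To be clear, what follows is not that loader, just a minimal sketch
of what an OBJ reader involves.  It only handles "v" position lines
and triangular "f" lines, uses fixed-size arrays, and the names
ObjMesh, obj_parse_line and obj_load are made up for illustration.)

```c
#include <stdio.h>
#include <stdlib.h>

#define OBJ_MAX_VERTS 1024
#define OBJ_MAX_TRIS  2048

typedef struct {
    float positions[OBJ_MAX_VERTS][3];
    int   vertex_count;
    int   triangles[OBJ_MAX_TRIS][3];   /* zero-based indices into positions */
    int   triangle_count;
} ObjMesh;

/* Parses one line of an OBJ file.
 * Returns 1 if it was a vertex or face, 0 if ignored, -1 on error. */
int obj_parse_line(const char *line, ObjMesh *mesh)
{
    if (line[0] == 'v' && line[1] == ' ') {
        float x, y, z;
        if (sscanf(line + 2, "%f %f %f", &x, &y, &z) != 3) return -1;
        if (mesh->vertex_count >= OBJ_MAX_VERTS) return -1;
        mesh->positions[mesh->vertex_count][0] = x;
        mesh->positions[mesh->vertex_count][1] = y;
        mesh->positions[mesh->vertex_count][2] = z;
        mesh->vertex_count++;
        return 1;
    }
    if (line[0] == 'f' && line[1] == ' ') {
        char tok[3][64];
        int i;
        if (sscanf(line + 2, "%63s %63s %63s", tok[0], tok[1], tok[2]) != 3)
            return -1;
        if (mesh->triangle_count >= OBJ_MAX_TRIS) return -1;
        for (i = 0; i < 3; i++) {
            /* Tokens look like "7", "7/2" or "7/2/5"; strtol reads the
             * leading position index and stops at the first '/'. */
            long idx = strtol(tok[i], NULL, 10);
            if (idx < 0) idx = mesh->vertex_count + idx + 1;  /* relative index */
            if (idx < 1 || idx > mesh->vertex_count) return -1;
            mesh->triangles[mesh->triangle_count][i] = (int)(idx - 1);
        }
        mesh->triangle_count++;
        return 1;
    }
    return 0;   /* comments, normals, texcoords, groups, etc. are skipped */
}

/* Reads a whole file.  Returns 0 on success, -1 on any error. */
int obj_load(const char *path, ObjMesh *mesh)
{
    char line[256];
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    mesh->vertex_count = 0;
    mesh->triangle_count = 0;
    while (fgets(line, sizeof line, f)) {
        if (obj_parse_line(line, mesh) < 0) {
            fclose(f);
            return -1;
        }
    }
    fclose(f);
    return 0;
}
```

A real loader would also want normals, texture coordinates and
polygons with more than three corners, which this skips.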

If you want animations, things get harder.  You have to pick a tool to
use to make your animations, then you have to figure out if that tool
exports a format that you can find or write importing code for.  Good
engines write their own importers/exporters for their own formats.
Why no universal format?  Different formats allow for different
features.  Since each game specializes in some way, it's hard to make
one format universal.  It gets worse - you can't just "import into
OpenGL."  OpenGL is great, but it deals with primitives.  If you want
to animate, you need a foundation in 3D geometry because you have to
know how your animations really work, down to the vertex.

I ended up using Blender/MD2 as my tool and animation format.  You'll
see all of this in our upcoming game next month.  MD2 is nice because
it's very easy on the CPU if you write an interpolator in C and use
fixed point math for it.  The loader took a few days to figure out but
there are docs online that go over the file format in detail.  The
problem with it is that, since it stores per-vertex keyframes rather
than bones, you can't combine top/bottom animations and you can't
just reskin.  You also can't mount weapons easily (though we did it
with some nasty hacks).
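
To give an idea of what a fixed-point keyframe interpolator looks
like, here is a minimal sketch.  It is not our actual code: it assumes
the two keyframes have already been decoded into flat arrays of 16.16
fixed-point coordinates, and the names fixed_t, fixed_mul and
lerp_frames are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t fixed_t;            /* 16.16 fixed point */
#define FIXED_ONE ((fixed_t)65536)  /* 1.0 in 16.16 */

/* Multiply two 16.16 values using a 64-bit intermediate.  Relies on
 * arithmetic right shift for negative values, which is what common
 * compilers do but is technically implementation-defined in C. */
static fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * (int64_t)b) >> 16);
}

/* Blend two keyframes into out.
 *   from, to : coordinate arrays (x0,y0,z0,x1,...) of the two keyframes
 *   t        : blend factor, 0 = from, FIXED_ONE = to
 *   count    : number of coordinates (3 * vertex count) */
void lerp_frames(const fixed_t *from, const fixed_t *to, fixed_t t,
                 fixed_t *out, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        out[i] = from[i] + fixed_mul(to[i] - from[i], t);
    }
}
```

Each frame you advance t according to elapsed time, and when it
passes FIXED_ONE you move on to the next pair of keyframes.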

As for levels... I haven't found an optimal solution yet.  We used
Blender and manually partitioned the space into areas separated by
visibility portals.  I wrote a portal visibility algorithm so that
we only draw the areas that are currently visible.  We create a
separate collision mesh, which I load into an octree and run
collision tests against, all in native code (a must for this stuff).
The advantage to this is that it uses off-the-shelf tools and
formats.  The downside is that since we don't use a proper level
editor, we don't get nice things like built-in entities, surface
types, etc.  We had to hack it all in.
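
The general shape of a portal traversal can be sketched like this.
It is a simplified illustration rather than the algorithm we shipped:
the Area, Portal and View types are made up, and the visibility test
(is the portal's center in front of the camera?) is a crude stand-in
for a proper test of the portal against the view frustum.

```c
#define MAX_PORTALS_PER_AREA 8

typedef struct {
    float position[3];
    float forward[3];       /* unit vector the camera looks along */
} View;

typedef struct {
    int   target_area;      /* index of the area on the other side */
    float center[3];        /* world-space center of the portal */
} Portal;

typedef struct {
    Portal portals[MAX_PORTALS_PER_AREA];
    int    portal_count;
} Area;

typedef int (*PortalTest)(const Portal *portal, const View *view);

/* Hypothetical, very rough test: the portal counts as visible if its
 * center lies in front of the camera. */
int portal_in_front_of_camera(const Portal *portal, const View *view)
{
    float dx = portal->center[0] - view->position[0];
    float dy = portal->center[1] - view->position[1];
    float dz = portal->center[2] - view->position[2];
    return dx * view->forward[0] + dy * view->forward[1]
         + dz * view->forward[2] > 0.0f;
}

/* Starting from the area the camera is in, mark every area reachable
 * through a chain of visible portals.  visible[] must be zeroed by the
 * caller and have one entry per area. */
void mark_visible_areas(const Area *areas, int area_index,
                        PortalTest portal_visible, const View *view,
                        unsigned char *visible)
{
    int i;
    if (visible[area_index]) return;    /* already handled, avoids cycles */
    visible[area_index] = 1;
    for (i = 0; i < areas[area_index].portal_count; i++) {
        const Portal *p = &areas[area_index].portals[i];
        if (portal_visible(p, view)) {
            mark_visible_areas(areas, p->target_area, portal_visible,
                               view, visible);
        }
    }
}
```

After the traversal you draw only the areas whose visible flag is set.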

How long did all of this take us to learn and do?  I started learning
OpenGL ES last September and really started this game in November.
That puts us around 5 months on this title and we're still a good
month away from being done with it.  Too long for my tastes but it
does take a lot of time to learn how to do all of this stuff so I'm
glad I invested the time to learn.

