On 10/31/15 2:44 AM, Nana Okyere wrote:
> New to python, web development and hence flask. I have a small
> application that accepts excel inputs, extracts the data and writes it
> out oracle. My stack includes flask-sqlalchemy, cx_oracle. When I upload
> a small excel file with about 10 K rows, it works just fine. When I
> throw really large files at it up to a million rows, it craps out and I
> get :
> 
> |
> sqlalchemy.exc.operationalerror:(cx_oracle.operationalerror)ora-03114:notconnected
> to oracle

SQLAlchemy has no timeout feature, this would be cx_oracle itself
dropping the connection.

the code here doesn't show the kind of table we're dealing with, if it
uses large objects (e.g. CLOB, BLOB) that could contribute to an issue
like this, if the value being sent is too large for memory.     Can't
make any determination from the information that's here.





> |
> 
> My view function:
> 
> |
> @theapp.route('/upload', methods=('GET', 'POST'))
> def upload():
>     start_time = time.time()
>     form = UploadForm()
>     if form.validate_on_submit():
>         file_name = secure_filename(form.upload.data.filename)
>        
> form.upload.data.save(os.path.join(theapp.config['UPLOAD_FOLDER'],
> file_name))
>         input_file = os.path.join(theapp.config['UPLOAD_FOLDER'], file_name)
>         wb = load_workbook(input_file, data_only = True, read_only=True,
> use_iterators=True)
>         sheet = wb.worksheets[0]
>         allrows = sheet.iter_rows() #this is a generator
>         headerobject = next(allrows) # row object for headerobject
> 
>         # Column names from excel file
>         headerlist = []
>         for c in headerobject:
>             if c.value is not None: # If user uses delete button to
> delete contents of header cell, ignore it
>                 if isinstance(c.value, str) is False: # if not a string
>                     return bad_request("Column names have to be strings
> without spaces. Value in cell {} is not a string".format(c.coordinate))
>                 if not c.value.strip(): # Empty string in header
>                     return bad_request("Cell {} has line space(s).
> Delete the contents of the cell".format(c.coordinate))
>                 headerlist.append(c.value.strip())
>         #print("Excel {}".format(headerlist))
> 
>         duplicate_elements = [k for k,v in Counter(headerlist).items()
> if v >1]
>         if duplicate_elements: # A falsy. If there are duplicate elements
>             return bad_request("Column(s) {} are
> duplicates".format(duplicate_elements))
> 
>         tableinfo = form.targettableinfo.data # tableinfo object in the
> databse order
>         tablecolumns = []
>         for col in tableinfo.fields:
>             tablecolumns.append(col.name)
>         #print("Table columns {}".format(tablecolumns))
> 
>         # Check if the columns match
>         if not set(headerlist) == set(tablecolumns):
>             foreign_columns = [x for x in headerlist if x not in
> tablecolumns]
>             if foreign_columns:
>                 return bad_request("Column(s) {} in your excel file are
> not defined".format(foreign_columns))
>             unfulfilled_columns = [x for x in tablecolumns if x not in
> headerlist]
>             if unfulfilled_columns:
>                 return bad_request("Column(s) {} are expected by the
> table but not found in your excel file.".format(unfulfilled_columns))
> 
>         # Sort the tablecolumns object in the order of the headerlist
>         sortedtablecolumnsobject = sorted(tableinfo.fields, key=lambda
> x: headerlist.index(x.name))
> 
>         # create a list of dicts to hold data to be written
>         output_data = []
>         for row in allrows:
>             row_dict = {}
>             for fobject,cobject in zip(sortedtablecolumnsobject,row):
>             ######################################
>             ##Validation & assignment
>             ######################################
>                 if fobject.type.name == "Text":
>                     if cobject.value == None:
>                         row_dict[fobject.name] = cobject.value
>                     else:
>                         row_dict[fobject.name] =
> removeNonAscii(str(cobject.value))
>                         #row_dict[fobject.name] =
> cobject.value.encode('utf-8')
>                         #print(str(cobject.value))
>                 elif fobject.type.name == "Number":
>                     if dao.isnumber(cobject.value):
>                         row_dict[fobject.name] = float(cobject.value)
>                         #print(float(cobject.value))
>                     elif cobject.value is None:
>                         row_dict[fobject.name] = cobject.value
>                         #print(cobject.value)
>                     else:
>                         return bad_request("Cell {} has value {}. A
> numeric value is expected".format(cobject.coordinate, cobject.value))
>                 elif fobject.type.name == "Date":
>                     if type(cobject.value) == datetime.datetime:
>                         row_dict[fobject.name] = cobject.value
>                     elif cobject.value is None:
>                         row_dict[fobject.name] = cobject.value
>                     else:
>                         return bad_request("Cell {} does not contain a
> valid date".format(cobject.coordinate))
>             ##############################################################
> 
>             output_data.append(row_dict)
> 
>         #print(output_data)
>         m = db.MetaData()
>         t = db.Table(tableinfo.table_name,m,autoload = True,
> autoload_with = db.engine)
>         db.engine.execute(t.insert(),output_data)
> 
>         print("{} seconds" .format(time.time() - start_time))
> 
>         return redirect(url_for('summary'))
> 
>     return render_template('upload.html', title = 'Upload', form = form)
> |
> 
> Basically, I create a giant list of dictionaries and execute an insert
> on the destination table with that output list. Like I said, it works
> for smaller files. For larger files, I think the connection to the
> database dies before I attempt to do the inserts. My stack trace shows
> that the error originates from the line that does the insertion. Is
> there a way to keep the database connection alive even if the request is
> taking too long? I have experimented with SQLALCHEMY_POOL_RECYCLE and
> SQLALCHEMY_POOL_TIMEOUT configurations in flask-sqlalchemy but it had no
> significance. What can I do to keep my database session/connection
> active during a long request? Thanks
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected]
> <mailto:[email protected]>.
> To post to this group, send email to [email protected]
> <mailto:[email protected]>.
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to